COMPARING SEVERAL MEANS: ANOVA

LAST UPDATED: November 15, 2012 COMPARING SEVERAL MEANS: ANOVA

Objectives 2 Basic principles of ANOVA Equations underlying one-way ANOVA Doing a one-way ANOVA in R Following up an ANOVA: Planned contrasts/comparisons n Choosing contrasts n Coding contrasts Post hoc tests ANOVA as regression

Objectives 3 Basic principles of ANOVA Equations underlying one-way ANOVA Doing a one-way ANOVA in R Following up an ANOVA: Planned contrasts/comparisons n Choosing contrasts n Coding contrasts Post hoc tests ANOVA as regression

When and Why 4 When we want to compare means we can use a t-test. This test has limitations: You can compare only 2 means: often we would like to compare means from 3 or more groups. It can be used only with one predictor/independent variable. ANOVA Compares several means. Can be used when you have manipulated more than one independent variable. It is an extension of regression (the general linear model).

Generalizing t-tests 5 t-tests allow us to test hypotheses about differences between two groups or conditions (e.g., treatment and control). What do we do if we wish to compare multiple groups or conditions simultaneously? Examples: Effects of 3 different therapies for autism Effects of 4 different SSRIs on seratonin re-uptake Effects of 5 different body orientations on judgement of induced self-motion.

Why Not Use Lots of t-tests? 6 If we want to compare several means why don t we compare pairs of means with t-tests? Inflates the Type I error rate. 1 1 2 1 3 2 2 3 3 familywise error = 1 0.95 n

Advantages of ANOVA 7 Avoid inflation in error rate due to multiple comparisons Can detect an effect of the treatment even when no 2 groups are significantly different.

What Does ANOVA Tell Us? 8 Null hypothesis: Like a t-test, ANOVA tests the null hypothesis that the means are the same. Experimental hypothesis: The means differ. ANOVA is an omnibus test It test for an overall difference between groups. It tells us that the group means are different. It doesn t tell us exactly which means differ.

6-Step Process for ANOVA 9 1. State the hypotheses : µ = µ =... = µ 0 1 2 H : i, j [1,..., n] : µ µ A i j 2. Select the statistical test and significance level 3. Select the samples and collect the data 4. Find the region of rejection 5. Calculate the test statistic 6. Make the statistical decision H n

Theory of ANOVA 10 We calculate how much variability there is between scores Total sum of squares (SS T ). We then calculate how much of this variability can be explained by the model we fit to the data How much variability is due to the experimental manipulation, model sum of squares (SS M )... and how much cannot be explained How much variability is due to individual differences in performance, residual sum of squares (SS R ).

Theory of ANOVA 11 We compare the amount of variability explained by the model (experiment), to the error in the model (individual differences) This ratio is called the F-ratio. If the model explains a lot more variability than it can t explain, then the experimental manipulation has had a significant effect on the outcome (DV).

Theory of ANOVA 12 If the experiment is successful, then the model will explain more variance than it can t MS M will be greater than MS R

End of Lecture November 8, 2012

Objectives 14 Basic principles of ANOVA Equations underlying one-way ANOVA Doing a one-way ANOVA in R Following up an ANOVA: Planned contrasts/comparisons n Choosing contrasts n Coding contrasts Post hoc tests ANOVA as regression

Example (Two Means) 15 X 2 1 X -10.2 4.8-1.8 6.7 15.2-0.8-0.4 8.9 12.3 23.1-7.0 5.2 0.1-0.1-7.8 9.1 5.9 0.6-2.5-11.1 X 1 X 2 Mean 0.4 4.6 2.5 Std Dev 8.4 8.8 X G s 1 s 2

Reinterpreting the 2-Sample t-statistic 16 t 2 = n ( X ) 2 1 X2 2 s 2 p 2 2 The denominator sp is an estimate of the variance σ of the population, derived by averaging the variances within the two samples: 2 1 2 2 sp = ( s1 + s2 ) 2

Reinterpreting the 2-Sample t-statistic 17 t 2 = n ( X ) 2 1 X2 2 s 2 p 2 The numerator is also an estimate of the variance σ of the population, derived from the variance betwe en the sample means. s To see this, recall that s X = s = ns n 2 2 X For 2 groups, s 2 X = ((X 1 X G ) 2 + (X 2 X G ) 2 ), where X G = 1 2 (X + X ) 1 2 NB : Only 1 df in calculation of s X. Thus, s 2 X = X 1 1 2 (X + X ) 1 2 = 1 2 (X X ) 1 2 2 2 + X 2 1 2 (X + X ) 1 2 + 1 2 (X X ) 2 1 2 2 = 1 2 (X 1 X 2 )2

The F Distribution 18 F = t 2 = ( n X X 1 2) 2 2 2 s p Thus, under the null hypothesis, the numerator and denominator are 2 independent estimates of the same population variance σ. The ratio of 2 independent, unbiased estimates of the same variance foll ows an F distribution. 0.5 0.4 F distribution for 2 groups of size n=13 p(f) 0.3 0.2 0.1 0 0 2 4 6 8 10 F

Within and Between Variances 19 Recall that the variance is, by definition, the mean squared deviation of scores from their mean. Since the numerator of the F statistic estimates the variance from the deviations of group means, it is sometimes called the mean-square-between estimate. Since this estimate will include the effect of the independent variable (if any), it is also called the model mean-square MS M. Since the denominator of the F statistic estimates the variance from the deviations within groups, it is sometimes called the mean-square-within estimate. Since this deviation should only reflect the unmodeled noise (residual) it is also called the residual mean-square MS R. Thus F = MS M MS R

Generalizing to > 2 Groups 20 F = MS M MS R MS M = n i (X i X G ) 2, where X df G = 1 X M N i = 1 n T all scores N i X i T MS R = (n i 1)s i 2 df R

Sums of Squares Approach 21 F = MS M MS R MS M = SS M, where SS df M = n i (X i X G ) 2 M MS R = SS R, where SS df R = (n i 1)s i R 2 NB : SS T = SS M + SS R MS T MS M + MS R

Degrees of Freedom 22 The sample variance follows a scaled chi-square distribution, parameterized by the degrees of freedom. Thus the F distribution is a ratio of two chi-square distributions, each with different degrees of freedom. df M = k 1, where k = number of groups. df R = N T k, where N T = total number of subjects over all groups. = n i 1 i ( ) df T = df M + df R = N T 1

Properties of the F Distribution 23 Domain strictly non-negative Positively skewed 1 0.8 0.6 0.4 k = 3 n=2 n=5 n=10 n=100 µ = df R df R 2 Tail becomes lighter as df R 0.2 0 0 5 10 1 0.8 0.6 k = 10 n=2 n=5 n=10 n=100 0.4 Distribution Normal as df R,df M 0.2 0 0 2 4 6 8 10

The F Statistic 24 When H 0 is true (X 1 = X 2 = = X k ) we expect F 1. What do we expect when H is false? 0 F = MS M MS R MS M = n i (X i X G ) 2, where X df G = 1 X M N i = 1 n T all scores N i X i T MS R = (n i 1)s i 2 df R

Testing Hypotheses 25 F = MS M MS R Large values of F suggest that differences between the groups are inflating the MS M estimate of σ 2 reject H 0. 1 0.8 F distribution for 3 groups of size n=13 p(f) 0.6 0.4 0.2 0 0 2 4 6 8 10 F crit 3.32 for α =.05

Interpreting the F Ratio 26 estimate of treatment effect + between-group estimate of error variance F = within-group estimate of error variance

Example 27 Testing the effects of Viagra on libido using three groups: Placebo (sugar pill) Low dose viagra High dose viagra The outcome/dependent variable (DV) was an objective measure of libido.

The Data 28

29 Summary Table Source SS df MS F Model 20.14 2 10.067 5.12* Residual 23.60 12 1.967 Total 43.74 14

Objectives 30 Basic principles of ANOVA Equations underlying one-way ANOVA Doing a one-way ANOVA in R Following up an ANOVA: Planned contrasts/comparisons n Choosing contrasts n Coding contrasts Post hoc tests ANOVA as regression

One-Way ANOVA using R 31 When the Test Assumptions Are Met: Using lm(): viagramodel<-lm(libido~dose, data = viagradata) Using aov(): viagramodel<-aov(libido ~ dose, data = viagradata) summary(viagramodel)

Using lm() for ANOVAs 32 viagramodel<-lm(libido~dose, data = viagradata) Be careful that the independent variable is coded as a factor! If it is interpreted as a number, lm() will do a linear regression. By default we get contrasts of the second to last levels against the first level. For this experiment, this contrasts the low and high doses against the placebo.

ANOVA Assumptions 33 Independent random sampling Normal distributions Homogeneity of variance

34 Levene s Test for Homogeneity of Variance 1. Replace each score X, X,... with its absolute deviation from the sample mean: d 1i = X 1i X 1 d 2i = X 2i X 2 1i 2i 2. Now run an analysis of variance on d, d,...: F = MS M MS R 1i 2i MS M = SS M, where SS df M = n i (d i d G ) 2 M MS R = SS R 2, where SS df R = (n i 1)s di R

When variances are not equal across groups 35 If Levene s test is significant then it is unlikely that variances are homogeneous. Recall that when doing t-tests on a pair of means we applied the Welch-Sattertwhaite approximation to account for unequal variances. We can apply a comparable approximation for ANOVA by executing: oneway.test(libido ~ dose, data = viagradata)

Objectives 36 Basic principles of ANOVA Equations underlying one-way ANOVA Doing a one-way ANOVA in R Following up an ANOVA: Planned contrasts/comparisons n Choosing contrasts n Coding contrasts Post hoc tests ANOVA as regression

Why Use Follow-Up Tests? 37 The F-ratio tells us only that there is some effect of the IV. i.e. group means were different It does not tell us specifically which group means differ from which. We need additional tests to find out where the group differences lie.

How? 38 Orthogonal contrasts/comparisons Hypothesis driven Planned a priori Post hoc tests Not planned (no hypothesis) Compare all pairs of means

Planned Contrasts 39 Basic idea: The variability explained by the model (experimental manipulation, SS M ) is due to participants being assigned to different groups. This variability can be broken down further to test specific hypotheses about which groups might differ. We break down the variance according to hypotheses made a priori (before the experiment).

40 Rules When Choosing Contrasts Independent Contrasts must not interfere with each other (they must test unique hypotheses). Only two chunks Each contrast should compare only two chunks of variation. K 1 You should always end up with one less contrast than the number of groups.

Generating Hypotheses 41 Example: Testing the effects of Viagra on libido using three groups: Placebo (sugar pill) Low dose viagra High dose viagra Dependent variable (DV) was an objective measure of libido. Intuitively, what might we expect to happen?

Placebo Low Dose High Dose 3 5 7 2 2 4 1 4 5 1 2 3 4 3 6 Mean 2.20 3.20 5.00

How do I Choose Contrasts? 43 Big hint: In most experiments we usually have one or more control groups. The logic of control groups dictates that we expect them to be different from groups that we ve manipulated. The first contrast will typically be to compare any control groups (chunk 1) with any experimental conditions (chunk 2).

Hypotheses 44 Hypothesis 1: People who take Viagra will have a higher libido than those who don t. Placebo < (Low, High) Hypothesis 2: People taking a high dose of Viagra will have a greater libido than those taking a low dose. Low < High

Planned Comparisons 45

Another Example 46

Coding Planned Contrasts: Rules 47 Rule 1 Groups coded with positive weights compared to groups coded with negative weights. Rule 2 The sum of the positive weights should equal the sum of the negative weights. Rule 3 If a group is not involved in a comparison, assign it a weight of zero. Rule 4 If a group is singled out in a comparison, then that group should not be used in any subsequent contrasts. Note: Ignore rules in textbook.

Orthogonal Contrasts 50 Two contrasts are said to be orthogonal when the sum of the products of the weights for the two contrasts is 0.

Orthogonality and Type I Errors 52 If two contrasts are not orthogonal, whether you commit a Type I error on the first may be correlated with whether you commit a Type I error on the second. If the two contrasts are orthogonal, these two events should be uncorrelated. In either case, one still has inflation of experiment-wise Type I error. Despite this, many publications allow reporting of planned orthogonal contrasts without correction for multiple comparisons.

Planned Contrasts using R 53 To do planned comparisons in R we have to set the contrast attribute of our grouping variable using the contrast() function and then recreate our ANOVA model using viagramodel <- aov(libido~dose, data=viagradata) summary.lm(viagramodel) or viagramodel <- lm(libido~dose, data=viagradata) summary(viagramodel)

Objectives 54 Basic principles of ANOVA Equations underlying one-way ANOVA Doing a one-way ANOVA in R Following up an ANOVA: Planned contrasts/comparisons n Choosing contrasts n Coding contrasts Post hoc tests ANOVA as regression

Post Hoc Tests 55 Post-hoc tests are tests you decide to do after the data are collected and examined. Since you have already seen the data, regardless of how many tests you are interested in you must correct as though you are interested in all possible pairwise tests. How you conduct post hoc tests in R depends on which test you d like to do. Bonferroni can be done using the pairwise.t.test() function, which is part of the R base system. Tukey can be done using the glht() function in the multcomp() package.

56 Bonferroni-Dunn Test 56 For a given number of comparisons j the experiment-wise alpha will never be more than j times the per-comparison alpha. α jα EW PC set α PC = α EW j Experimentwise alpha 6 5 4 3 2 1 0 0.1 0.3 0.5 j=5 j α PC α 0.7 EW = 1 1 0.9 alpha (per comparison) ( α) j

Tukey s Honestly Significant Difference 57 Key Idea: Given k groups, consider the smallest and largest means. Ensure protection against Type I error when comparing these two means. This is guaranteed to protect against Type I error for all pairs. Treatment Group n Mean A 10 45.5 B 11 46.8 C 9 53.2 D 10 54.5

Tukey 58 For the Viagra data, we can obtain Tukey post hoc tests by executing: posthocs<-glht(viagramodel, linfct = mcp(dose = "Tukey")) summary(posthocs) confint(posthocs)

Output 59

Which Posthoc Test To Use 60 When doing a full set of post-hoc tests, Tukey is usually more powerful than Bonferroni. However, when doing a small number of planned comparisons, Bonferroni can be more powerful. Also, Tukey s assumptions can fail if the sample sizes are very different use Bonferroni in this case.

Objectives 61 Basic principles of ANOVA Equations underlying one-way ANOVA Doing a one-way ANOVA in R Following up an ANOVA: Planned contrasts/comparisons n Choosing contrasts n Coding contrasts Post hoc tests ANOVA as regression

ANOVA as Regression 62 ( ) outcome = model + error i libido = b + b high + blow + ε i 0 2 i 1 i i i

The Data 63

Placebo Group 64 libido = b + b high + blow + ε ( ) ( ) libido = b + b 0 + b 0 libido X i 0 2 i 1 i i i i = = b b 0 2 1 0 placebo 0

High Dose Group 65 libido = b + b high + blow + ε i 0 2 i 1 i i ( ) ( ) libido = b + b 1 + b 0 libido i i 0 2 1 = b + b 0 2 libido = b + b i 0 2 X = X + b high placebo 2 b = X X 2 high placebo

Low Dose Group 66 libido = b + b high + blow + ε i 0 2 i 1 i i ( ) ( ) libido = b + b 0 + b 1 i 0 2 1 libido i = b + b 0 1 libido = b + b i 0 1 X = X + b Low placebo 1 b = X X 1 low placebo

Output from Regression 67

Experiments vs. Correlation 68 ANOVA in linear regression (continuous IV): Used to assess whether the linear regression model is good at predicting an outcome. Remember that the model assumes that the DV is a linear function of the IV. ANOVA in testing means (discrete IV): Used to see whether different levels of the IV lead to differences in an outcome variable (DV). There is no assumption with regard to linearity of the DV as a function of the IV.