ANOVA (Analysis of Variance) output RLS 11/20/2016

Analysis of Variance (ANOVA)

The goal of ANOVA is to determine whether the variation in the data is large enough to indicate differences among the group means. If there are differences, we infer that the means come from different populations. The main limitation of ANOVA is that it only tells us whether at least one mean differs, not which ones.

The population model: $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$

Where:
* $y_{ij}$: response variable
* $\mu$: grand (overall) mean
* $\alpha_i$: effect of treatment $i$
* $\varepsilon_{ij}$: the residual (error) term. The squared residuals are summed, $\sum (y_i - \hat{y}_i)^2$, to give the total squared distance between each estimated y and the true (observed) value of y.

A residual (error) is calculated as $e_i = y_i - \hat{y}_i$. The residuals are calculated the same way as in regression. In regression we calculated the coefficients of the linear equation ($\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$) and explored the significance of the slope. The same sort of thing happens in ANOVA, but only after we determine whether or not the means come from the same population.

The same assumptions as for regression hold for ANOVA and thus also need to be checked.

1. $E(\varepsilon_i) = 0$: the mean of the residuals is 0.
2. $V(\varepsilon_i) = \sigma^2_\varepsilon$: the variance of the residuals is constant (the same) for all values of y. Also called constant variance or homogeneity of variance (meaning same variance).
3. $Cov(\varepsilon_i, \varepsilon_j) = 0$: independence of residuals.
4. $\varepsilon_i \sim N(0, \sigma^2_\varepsilon)$: residuals have an approximately normal distribution with mean 0 and homogeneous variance.

The hypotheses

$H_0: \mu_1 = \mu_2 = \dots = \mu_t$ vs. $H_a$: at least one $\mu_i$ differs

Sometimes the hypotheses are written as:

$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_t = 0$ vs. $H_a$: $H_0$ not true

Equations and the ANOVA table:

The goal of ANOVA is to measure the variation between groups and within groups. The variation is measured with variances: the variance due to the treatment and the variance due to the residuals (errors).
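To make the population model concrete, here is a minimal R sketch that simulates data from $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$. Every number in it (the grand mean, the treatment effects, the error standard deviation, the group size) is made up for illustration and does not come from the handout's data.

```r
# Simulate data from y_ij = mu + alpha_i + eps_ij (all values below are made up).
set.seed(1)
mu    <- 50                      # grand mean (hypothetical)
alpha <- c(-10, 0, 3, 3, 4)      # treatment effects (hypothetical); they sum to 0
n     <- 4                       # replicates per treatment (hypothetical)
group <- factor(rep(seq_along(alpha), each = n))
eps   <- rnorm(length(group), mean = 0, sd = 5)   # errors ~ N(0, sigma^2)
y     <- mu + alpha[as.integer(group)] + eps
sim   <- data.frame(group, y)

# Each group mean estimates mu + alpha_i for that treatment.
tapply(sim$y, sim$group, mean)
```

Under $H_0$ all the entries of alpha would be 0, so every group would share the same population mean.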

The table you are creating is as follows:

Source             df    SS    MS    F         Pr(>F)
Treatment          t-1   SSTr  MSTr  MSTr/MSE  P(F>Fcalc)
Error (Residuals)  N-t   SSE   MSE   -         -
Total              N-1   TSS

Defining some components:
* t is the number of treatment groups (levels of the factor)
* N is the total number of observations in the experiment
* SS are the sums of squares (so SSTr is the sum of squares for treatment, etc.)
* MS are the mean squares (so MSTr is the mean square for treatment, etc.)
* $\bar{y}_i$ is the i-th treatment group mean
* $\bar{y}_{..}$ is the grand mean of all observations
* F is the test statistic; it requires two degrees of freedom (treatment and error)
* P(F > Fcalc) is the pvalue

Use the following equations to calculate the values:

$N = \sum_i n_i$
$\bar{y}_{..} = \frac{\sum_i \bar{y}_i}{t} = \frac{\sum_i \sum_j y_{ij}}{N}$ (the first form requires equal group sizes)
$SSTr = \sum_i n_i (\bar{y}_i - \bar{y}_{..})^2$
$SSE = \sum_i \sum_j (y_{ij} - \bar{y}_i)^2 = \sum_i s_i^2 (n_i - 1)$
$TSS = SSTr + SSE$
$MSTr = \frac{SSTr}{t - 1}$
$MSE = \frac{SSE}{N - t}$
$F_{calc} = \frac{MSTr}{MSE}$

Rejection is usually decided with a pvalue, but you will also learn the critical value approach using an F table.

Critical value approach
Reject $H_0$ if $F_{calc} \ge F_{\alpha, df_{trt}, df_{error}}$

pvalue approach
The F test is always a one-tailed test. The pvalue is calculated as pvalue = $P(F > F_{calc})$
In R: pf(Fcalc,dftrt,dfe,lower.tail=F) or 1-pf(Fcalc,dftrt,dfe), but we will be using an analysis with the entire above table computed.

An experiment was conducted [1] to test the effects of nitrogen fertilizer on lettuce production. Five rates of ammonium nitrate were applied to four replicate plots in a completely randomized design (CRD).

    salad=read.csv("…")
    head(salad)

      nitrogen lettuce

[1] Dr. B. Gardner, Department of Soil and Water Science, University of Arizona.
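As a side sketch on made-up numbers (not the lettuce data), the table formulas and both rejection approaches described above can be computed directly in R. The data frame toy, its values, and the name k (used for the number of treatments instead of t) are assumptions for illustration only.

```r
# Made-up data: 3 treatments, 3 observations each (not the lettuce data).
toy <- data.frame(
  trt = factor(rep(c("A", "B", "C"), each = 3)),
  y   = c(1, 2, 3,  2, 3, 4,  5, 6, 7)
)

ybar_i <- tapply(toy$y, toy$trt, mean)     # treatment means
s2_i   <- tapply(toy$y, toy$trt, var)      # treatment variances
n_i    <- tapply(toy$y, toy$trt, length)   # treatment sample sizes
N      <- sum(n_i)
k      <- length(n_i)                      # number of treatments (t in the text)
ybar   <- mean(toy$y)                      # grand mean

SSTr <- sum(n_i * (ybar_i - ybar)^2)       # between-group sum of squares
SSE  <- sum(s2_i * (n_i - 1))              # within-group sum of squares
TSS  <- sum((toy$y - ybar)^2)              # equals SSTr + SSE
MSTr <- SSTr / (k - 1)
MSE  <- SSE / (N - k)
Fcalc <- MSTr / MSE

# Critical value approach (alpha = 0.05 assumed) and pvalue approach.
Fcrit <- qf(0.95, k - 1, N - k)
pval  <- pf(Fcalc, k - 1, N - k, lower.tail = FALSE)
c(SSTr = SSTr, SSE = SSE, TSS = TSS, Fcalc = Fcalc, Fcrit = Fcrit, pval = pval)
```

For these toy numbers Fcalc is at least Fcrit and pval is below 0.05, so both approaches lead to the same decision, as they always do.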

    boxplot(lettuce~nitrogen,main='Boxplot of lettuce production',data=salad,
            xlab='Nitrogen',ylab='Lettuce production (heads)')

[Figure: Boxplot of lettuce production (x-axis: Nitrogen, y-axis: Lettuce production (heads))]

    # treatment means, variances, and standard deviations
    ybari=with(salad,tapply(lettuce,nitrogen,mean,na.rm=T))
    s2i=with(salad,tapply(lettuce,nitrogen,var,na.rm=T))
    si=sqrt(s2i)
    # number of observations in each treatment group
    ni=with(salad,tapply(lettuce,nitrogen,length))
    N=sum(ni)
    # number of treatment groups
    t=nlevels(factor(salad$nitrogen))
    # grand mean (average of the group means; valid because group sizes are equal)
    ybar=sum(ybari)/t
    rbind(N,t,ybar)

         [,1]
    N    20.0
    t     5.0
    ybar

    cbind(1:t,ybari,s2i,si,ni)

      ybari s2i si ni

Is there sufficient evidence that the treatment is effective? [Same as asking if there is at least one mean that is different.] Do an ANOVA and report the test statistic, pvalue, result, and conclusion in context.

By hand:

$\bar{y}_{..} = \frac{\sum_i \bar{y}_i}{t} = \frac{\sum_i \sum_j y_{ij}}{N}$
$SSTr = \sum_i n_i (\bar{y}_i - \bar{y}_{..})^2$, with $n_i = 4$ for each of the five groups
$SSE = \sum_i \sum_j (y_{ij} - \bar{y}_i)^2 = \sum_i s_i^2 (n_i - 1) = (4 - 1) \sum_i s_i^2 = 3338$
$TSS = SSTr + SSE$
$MSTr = \frac{SSTr}{t - 1}$
$MSE = \frac{SSE}{N - t} = \frac{3338}{20 - 5} \approx 222.53$
$F_{calc} = \frac{MSTr}{MSE}$
pvalue = $P(F > F_{calc})$

Source             df   SS     MS      F   Pr(>F)
Treatment          4
Error (Residuals)  15   3338   222.53
Total              19

Reject $H_0$ if $F_{calc} \ge F_{\alpha, df_{trt}, df_{error}}$. Here $F_{calc}$ exceeds the critical value, therefore we reject $H_0$. The treatment (ammonium nitrate) has a significant effect on lettuce production.

With R

Create a model, similar to how it is done in regression. The ANOVA table is calculated by the model command and displayed with a summary command. First, create the model with aov(), naming it so that you can pass it to summary(). This could also be done with lm() and anova(), but the multiple comparison that you would usually do next (part 2 of this document) often requires the aov() model with summary().

The syntax for aov() with a name:

    fit=aov(y~x,data= )

Then display the ANOVA results from aov() with summary():

    summary(fit)

    salad.fit=aov(lettuce~factor(nitrogen),data=salad)
    summary(salad.fit)

                     Df Sum Sq Mean Sq F value Pr(>F)
    factor(nitrogen)                                  **
    Residuals
    Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
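The remark that lm() with anova() gives the same table as aov() with summary() can be checked with a short sketch. It uses a made-up data frame (toy, not the lettuce data), and the object names are chosen here for illustration.

```r
# Made-up data: 3 treatments, 3 observations each (not the lettuce data).
toy <- data.frame(
  trt = factor(rep(c("A", "B", "C"), each = 3)),
  y   = c(1, 2, 3,  2, 3, 4,  5, 6, 7)
)

fit_aov <- aov(y ~ trt, data = toy)
fit_lm  <- lm(y ~ trt, data = toy)

tab_aov <- summary(fit_aov)[[1]]   # ANOVA table from aov() + summary()
tab_lm  <- anova(fit_lm)           # the same table from lm() + anova()

tab_aov
tab_lm
```

Both tables report the same degrees of freedom, sums of squares, F value, and pvalue. The aov() object is kept in the handout because the multiple comparison step is run on it.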

Checking assumptions with diagnostic graphs

    res=rstudent(salad.fit); pred=fitted(salad.fit)
    # Assumption 1: histogram should be centered around (approximately) 0,
    # or the mean of the residuals should be approximately = 0
    hist(res,main='Histogram of residuals')

[Figure: Histogram of residuals (x-axis: res, y-axis: Frequency)]

    # use a boxplot too
    boxplot(res,main='Boxplot of residuals')

[Figure: Boxplot of residuals]

    mean(res)
    [1]

    # Assumption 2: plot of x=predicted and y=residuals should have
    # no discernible pattern (random scatter)
    plot(pred,res,main='Residuals vs. Predicted'); abline(0,0)

[Figure: Residuals vs. Predicted (x-axis: pred, y-axis: res)]

    # Assumption 3: independence of residuals -- no need to check; assume it's met
    # Assumption 4: normality of residuals, so the histogram should be approximately
    # symmetric/bell-shaped, or in the QQplot (normal probability plot) most points
    # should lie along the reference line
    qqnorm(res,main='QQPlot of Residuals'); qqline(res)

[Figure: QQPlot of Residuals (x-axis: Theoretical Quantiles, y-axis: Sample Quantiles)]

2. Multiple Comparisons

The main limitation of ANOVA is that it only answers the question: is there at least one mean that is different? It does not state where the significant differences are, and it does not tell us which means are best. Multiple comparisons are only to be performed when we reject the null hypothesis from the ANOVA. We do multiple comparisons to detect where the significant differences between population means are and to find the best treatment groups. We cannot just do multiple 2-sample CIs (t-tests) for independent means that we learned recently. Rather, we will do modified versions of them.

Why can we not just do the 2-sample CIs (t-tests) for independent means that we learned recently? The reason is that when we do the CIs that way, each CI has a significance level of 5%. That is, each one has α = 0.05, where α is the probability of a Type I error: rejecting the null hypothesis ($H_0$) when it is true. If we have 3 CIs to look at, each with a 5% significance level, doing the tests simultaneously means that the significance level for the whole experiment can be as large as the sum of the individual levels, or (3)5% = 15% (for three independent comparisons the exact rate is $1 - 0.95^3 \approx 14.3\%$). That means we risk rejecting a true hypothesis up to about 15% of the time, rather than 5%. A special analysis, called a multiple comparison, makes adjustments for doing more than one 2-sample CI within an experiment that had significant results in its ANOVA.

There are many types of multiple comparisons available, but the one we will do is called Fisher's Least Significant Difference (LSD).

To perform any multiple comparison (by hand, which we won't do, but this is what is happening in R), follow these steps:

1. Calculate the mean of each treatment group, $\bar{y}_i$.
2. Calculate the absolute value of the difference between each unique pair of means, $|\bar{y}_i - \bar{y}_j|$.
3. Calculate Fisher's LSD statistic, $LSD_{ij} = t_{\alpha/2,\,N-t} \sqrt{\frac{2\,MSE}{n}}$, where n is the common group size.
4. Compare the difference in means to the LSD statistic.

If $|\bar{y}_i - \bar{y}_j| \ge LSD_{ij}$, then the pair of means is declared significantly different. From there, you can look at the means and see which ones are smaller or larger than one another to determine which one(s) are best.

In R

The output provides grouping indicators, using lowercase letters. If a group is the only one with the letter (a), then it is considered different from the others. If more than one group has the letter (a), then those groups are considered different from groups with other letters but not different from each other. If all groups have the same letter, then there are no significant differences between groups.

To perform the multiple comparison, you will have to install the package that contains the commands. To do that, enter the following into the console: install.packages("agricolae"). Then load the package (code listed below) with the library() command.

Occasionally, the LSD.test() command is picky. Sometimes it won't produce output, so try using:

    lsd=LSD.test(fit,"factorvar",group=T,console=T)
    print(lsd)

Sometimes it won't run, and you will need to input some values yourself, such as:

    SSE=deviance(fit)
    df.e=df.residual(fit)
    MSE=SSE/df.e
    LSD.test(response,factor,df.e,MSE,group=T,console=T)
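The error-rate argument and the by-hand LSD steps above can be sketched in a few lines of R. All inputs here (the means, MSE, error degrees of freedom, and group size) are made-up values, and the two-sided critical value qt(1 - alpha/2, df) is an assumption about how the t quantile in step 3 is chosen.

```r
alpha <- 0.05
k     <- 3                       # number of pairwise comparisons
bound <- k * alpha               # additive upper bound used in the text
fwer  <- 1 - (1 - alpha)^k       # exact rate if the k comparisons were independent
c(bound = bound, fwer = fwer)

# Fisher's LSD for equal group sizes (MSE, df_error, and n are made up).
MSE      <- 1
df_error <- 6
n        <- 3
LSD <- qt(1 - alpha / 2, df_error) * sqrt(2 * MSE / n)

ybar_i <- c(A = 2, B = 3, C = 6)            # hypothetical treatment means
diffs  <- abs(outer(ybar_i, ybar_i, "-"))   # |ybar_i - ybar_j| for every pair
signif_pairs <- diffs >= LSD                # TRUE marks a significantly different pair
signif_pairs
```

With these toy values, A and B are not declared different from each other, while C differs from both, which is the pattern the letter groupings in the package output summarize.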

    library(agricolae)
    LSD.test(salad.fit,"factor(nitrogen)",group=T,console=T)

    Study: salad.fit ~ "factor(nitrogen)"
    LSD t Test for lettuce
    Mean Square Error:
    factor(nitrogen), means and individual ( 95 %) CI
      lettuce std r LCL UCL Min Max
    alpha: 0.05 ; Df Error: 15
    Critical Value of t:
    Least Significant Difference
    Means with the same letter are not significantly different.
    Groups, Treatments and means
    a
    a
    a
    a
    b

According to the grouping, four nitrogen levels (200, 150, 100, and 50) share the letter a, and nitrogen level 0 has the letter b. That tells us that nitrogen level 0 is significantly different from the other levels. Since nitrogen levels 50 through 200 aren't different from each other, there would probably be no need to use more than nitrogen level 50.
