ANOVA 3/12/2012. Two reasons for using ANOVA. Type I Error and Multiple Tests. Review Independent Samples t test

// ANOVA Lectures - Readings: GW Review Independent Samples t test Placeo Treatment 7 7 7 Mean... Review Independent Samples t test Placeo Treatment 7 7 7 Mean.. t (). p. C. I.: p t tcrit s pt crit s t p t p t s p t sp s p n p n t Frequently Need to Compare More than Two Groups Success versus Failure Feedac on susequent performance; If means differ is it ecause success feedac helps? Because failure feedac hurts? Both? Need a no feedac control condition Is there an effect of a therapeutic intervention and does it depend on dosage? Need for placeo and various dosage levels of treatment Does the effect of a therapeutic intervention depend upon prior symptom level? Need for placeo and treatment conditions for participants who have high versus low levels of prior symptoms Why NOT successive pairwise t s? Type I Error and Multiple Tests Two reasons for using ANOVA With an =., there is a % ris of a Type I error. Thus, for every hypothesis tests, you expect to mae one Type I error. The more tests you do, the more ris there is of a Type I error. ) Compare multiple means ) Do so in a way that doesn t inflate the type-i error rate. ANOVA uses one test with one alpha level to evaluate all the mean differences, therey avoiding the prolem of an inflated experiment-wise type-i error Test-wise alpha level The alpha level you select for each individual hypothesis test. Experiment-wise alpha level The total proaility of a Type I error accumulated from all of the separate tests in the experiment. e.g. three t-tests, each at =. -> an experimentwise alpha of.

// Research Designs Research Designs Lie the t-test, ANOVA is used when the independent variale in a study is categorical (e.g. treatment groups) and the dependent variale is continuous. A t-test is used when there are only two groups to compare. ANOVA is used when there are two or more groups Terminology In analysis of variance, an independent variale is called a factor. (e.g. treatment groups) The groups that mae up the independent variale are called levels of that factor. (e.g. low dose, high dose, control) A research study that involves only one factor is called a single-factor design. A research study that involves more than one factor is called a factorial design (e.g. treatment group and gender) ANOVA Notation H o : = = = H A : at least one population mean is different = a particular group; in total there are groups i = a particular person in the th group, i varies and n (initially assume equal n ) (n )=N i n ANOVA theory (formulas for elucidation not memorization) If H true, n nown, and n constant, then: n n We will estimate this ratio (called the F ratio) and examine whether it is close to one. If it is, then we will not reect the null hypothesis; if it is sufficiently different from one then we will reect the null hypothesis. n F t statistic ANOVA Intuition t = otained difference sample means difference expected y chance Random variation + effect Random variation ANOVA Example Suppose that a psychologist wants to examine learning performance under three temperature conditions: o, 7 o, 9 o. Three samples of suects are selected, one sample for each treatment condition. The suects are taught some asic material and given a short test. F statistic F = otained variance sample means variance expected y chance F = Treatment o Treatment 7 o Treatment 9 o = = = Factor: temperature Levels:, 7, 9

// ANOVA Example ANOVA Example Why don t we ust run three t-tests? compare o sample and 7 o sample compare o sample and 9 o sample compare 7 o sample and 9 o sample Treatment o Treatment 7 o Treatment 9 o = = = There is some variaility among the treatment means Is it more than we might expect from chance alone? Treatment o Treatment 7 o Quic review of how we calculate variance: s n Treatment 9 o = = = So what is our estimate of the random variaility due to chance? s Treatment o Treatment 7 o p Treatment 9 o = = = = = = This comes from the -treatment variailities Treatment o Treatment 7 o Treatment 9 o = = = = = = So what is our estimate of the random variaility due to chance? N Treatment o Treatment 7 o Treatment 9 o = = = = = = So what is our estimate of the random variaility due to chance? A summary of how much each datapoint varies from its treatment mean. Exactly lie pooled s from -sample t-test, except + groups.

// Treatment o Treatment 7 o Treatment 9 o = = = Treatment o Treatment 7 o Treatment 9 o = = = F statistic F = otained variance sample means variance expected y chance So what is the variance our treatment means? s This is the -treatment variaility. Bigger effect = greater variaility. Treatment o Treatment 7 o Treatment 9 o = = = ( ) ( ) ( ) So what is the variance our treatment means? We need to loo at the deviation each treatment mean and the grand mean Treatment o Treatment 7 o Treatment 9 o = = = ( ) ( ) ( ) So what is the variance our treatment means? We need to square those deviations Treatment o Treatment 7 o Treatment 9 o = = = n ( ) n ( ) n ( ) So what is the variance our treatment means? And weight them y the sample size of each treatment Treatment o Treatment 7 o Treatment 9 o = = = n ( ) n ( ) n ( ) n ( ) So what is the variance our treatment means? Now add them up to get and put your in the denominator

// Treatment o Treatment 7 o Treatment 9 o = = = n ( ) n ( ) n ( ) So what is the variance our treatment means? n ( ) Treatment o Treatment 7 o Treatment 9 o = = = So what is our test statistic? F(, w) Treatment o Treatment 7 o Treatment 9 o = = = What are its degrees of freedom? w N Tying the Logic of ANOVA ac with the theory F statistic F = otained variance sample means variance expected y chance F = -treatment variance -treatment variance n ( ) / w / w n F(, ) Between-group statistics Formulas for ANOVA (for you to memorize) ( ) n Within-group statistics n ( i ) i Total statistics F(, ) / / N / N ( ) Total N Total i i F-statistic Effect size estimate r Total Treatment o Treatment 7 o Treatment 9 o Step : Calculate each treatment mean and the grand mean = = =

// Treatment o Treatment 7 o Treatment 9 o Treatment o Treatment 7 o Treatment 9 o = = = = = = = = = n ( ) n ( ) n ( ) Step : Calculate the a) Get the for each treatment ) w N w. Step : Calculate the a) For each treatment mean, find the squared deviation from the grand mean, and multiply it y the treatment n. ) n ( ) Treatment o Treatment 7 o Treatment 9 o Treatment o Treatment 7 o Treatment 9 o = = = = = = Step : Calculate F F(,).. Step : Compare F value to F crit. If F > F crit, reect the null hypothesis F(,). F. crit F F crit Reect the null! Treatment o Treatment 7 o Treatment 9 o Step : Otain an effect size estimate how ig is the effect r. = = = Total % of the variation in learning performance in this study is explained y temperature ANOVA summary tale Source Between Within Total n ( ) N i ( ) i - N - N - / / w w /

// Treatment o ANOVA summary tale Source Between Within Treatment 7 o Treatment 9 o = = = F =.. Total total = = = F statistics The F Distriution Because F-ratios are computed from two variances, F values will always e positive numers. When H o is true, the numerator and denominator of the F-ratio are measuring the same variance. In this case the two sample variances should e aout the same size, so the ratio should e near. In other words, the distriution of F-ratios should pile up around.. H reected only from large F values, ut the F-test tests two-tailed hypotheses aout mean differences (they can e in any direction). The F Distriution F Distriution Positively sewed distriution: Concentration of values near, no values smaller than are possile Lie t, F is a family of curves. Need to use oth degrees of freedom (for and variances). The F Distriution So for an alpha of., the critical value of F with degrees of freedom, is reect null Our F =. =. Assumptions. The samples are independent simple random samples. The populations are normal. The populations have equal variance Exercise y hand The data depicted elow were otained from an experiment designed to measure the effectiveness of three pain relievers (A, B, and C). A fourth group that received a placeo was also tested. Placeo Drug A Drug B Drug C Is there any evidence for a significant difference groups? 7

// Exercise y hand Exercise y hand Placeo Drug A Drug B Drug C Placeo Drug A Drug B Drug C Source Between Within Total Source Between Within F = 9 Total 7 F crit =.7, so reect null Comparing F and t Comparing F and t What is the relationship F and t? What is the relationship F and t? Placeo Drug C F = t. Compute an independent-samples t statistic for this data t = -.. Compute an F statistic for this data F =.