Analysis of Variance (ANOVA: ANalysis Of VAriance). Inf. Stats

Analysis of Variance

ANOVA: ANalysis Of VAriance
- generalized t-test: compares means of more than two groups
- fairly robust
- based on F distribution, compares variance
- two versions:
  - single (one-way) ANOVA: compare groups along 1 dimension, e.g., school classes
  - two-way ANOVA, etc.: compare groups along > 1 dimension, e.g., school classes and sex

Analysis of Variance: Typical applications
- single ANOVA: compare time needed for lexical recognition in healthy adults, patients with Wernicke's aphasia, patients with Broca's aphasia
- two-way ANOVA: compare lexical recognition time in males and females in the same three groups

Comparing Multiple Means
- for two groups: t-test
- testing at p = 0.05 shows significance 1 time in 20 if there is no difference in population mean (effect of chance)
- but suppose there are 7 groups, i.e., $\binom{7}{2} = 21$ pairs
- caution: several tests (on same data) run the risk of finding significance through sheer chance

Phony Significance through Multiple Tests
Example: Suppose you run three tests, always seeking a result significant at 0.05. If the tests are independent and there is no real effect, the chance of finding significance in at least one of the three is the family-wise α-level:
$\alpha_{FW} = 1 - (1 - \alpha)^n = 1 - (1 - 0.05)^3 = 1 - (0.95)^3 = 1 - 0.857 = 0.143$
- Bonferroni: to guarantee a family-wise alpha of 0.05, divide it by the number of tests
- Example: 0.05/3 = 0.017 (set α at 0.017); note: $0.983^3 \approx 0.95$
- ANOVA indicated, takes group effects into account
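
The family-wise arithmetic above can be checked with a few lines of Python. This is a minimal sketch that assumes the tests are independent, as the slide's formula does. The function names `familywise_alpha` and `bonferroni_alpha` are illustrative and not taken from any library.

```python
from math import comb

def familywise_alpha(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive in n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_alpha(alpha_fw: float, n_tests: int) -> float:
    """Per-test level that keeps the family-wise level at or below alpha_fw."""
    return alpha_fw / n_tests

n_pairs = comb(7, 2)                      # pairwise comparisons among 7 groups
fw_three = familywise_alpha(0.05, 3)      # the three-test example from the slide
per_test = bonferroni_alpha(0.05, 3)      # corrected per-test level
fw_after = familywise_alpha(per_test, 3)  # family-wise level after the correction

print(n_pairs, round(fw_three, 3), round(per_test, 3), round(fw_after, 3))
```

With the corrected per-test level, the family-wise level stays just under 0.05, which is the point of the note $0.983^3 \approx 0.95$.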

Analysis of Variance based on F distribution
F distribution (Moore & McCabe, 7.3, pp. 435-445)
- measures difference between two variances (variance = $\sigma^2$)
- always positive, since variance positive
- two degrees of freedom interesting, one for $s_1$, one for $s_2$
$F = \frac{s_1^2}{s_2^2}$

F-Test vs F Distribution
$F = \frac{s_1^2}{s_2^2}$ used in F-test
$H_0$: samples from same distribution ($\sigma_1 = \sigma_2$)
$H_a$: samples from different distributions ($\sigma_1 \neq \sigma_2$)
- value 1 indicates same variance
- values near 0 or $+\infty$ indicate difference
- F-test very sensitive to deviations from normal
- ANOVA uses F distribution, but is different: ANOVA ≠ F-test!

F Distribution
[Figure: critical area for the F-distribution at p = 0.05]
Note symmetry: $P\left(\frac{s_1^2}{s_2^2} > x\right) = P\left(\frac{s_2^2}{s_1^2} < \frac{1}{x}\right)$

F-test
Example: height

    group    sample size    mean     std dev
    boys     16             180cm    6cm
    girls    9              168cm    4cm

- is the difference in std dev significant? (α = 0.05)
- examine $F = \frac{s_{boys}^2}{s_{girls}^2}$
- degrees of freedom: $s_{boys}$: 16 - 1; $s_{girls}$: 9 - 1

F-test Critical Area (for 2-Tailed Test)
Upper critical value:
$P(F(15, 8) > f) = \frac{\alpha}{2} = \frac{0.05}{2} = 0.025$
$P(F(15, 8) < f) = 1 - 0.025$
$P(F(15, 8) < 4.1) = 0.975$ (from M&M, Tbl E, p. 706)
Lower critical value (the tables give no lower-tail values directly, so use the symmetry):
$P(F(15, 8) < x) = \frac{\alpha}{2} \; (= 0.025) \iff P(F(8, 15) > x') = \frac{\alpha}{2}$, where $x' = \frac{1}{x}$
$P(F(8, 15) > 3.2) = 0.025$ (tables)
$P(F(15, 8) < \frac{1}{3.2}) = 0.025$
$P(F(15, 8) < 0.31) = 0.025$
Reject $H_0$ if F < 0.31 or F > 4.1
Here, $F = \frac{6^2}{4^2} = 2.25$ (no evidence of difference in distribution)

ANOVA
Analysis of Variance (ANOVA)
- most popular statistical test for numerical data
- several types: single (one-way); two-, three-, ..., n-way; multiple ANOVA (MANOVA); repeated measures
- examines variation: between-groups (sex, age, ...), within-subject, within-groups, overall
- automatically corrects for looking at several relationships (like Bonferroni correction)
- uses F test, where F(n, m) fixes n typically at number of groups (less 1), m at number of subjects (data points) (less number of groups)
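
The two-tailed F-test on the height example (boys vs. girls) can be retraced as follows. This is a minimal sketch: the critical values are the tabled ones quoted on the slide rather than computed, and the helper names `variance_ratio` and `two_tailed_f_test` are illustrative.

```python
def variance_ratio(sd_1: float, sd_2: float) -> float:
    """F statistic for comparing two sample variances."""
    return sd_1 ** 2 / sd_2 ** 2

def two_tailed_f_test(f_value: float, lower_crit: float, upper_crit: float) -> bool:
    """Return True if H0 (equal variances) is rejected."""
    return f_value < lower_crit or f_value > upper_crit

# Tabled values from the slide, for alpha/2 = 0.025
UPPER_CRIT = 4.1       # P(F(15, 8) > 4.1) = 0.025
LOWER_CRIT = 1 / 3.2   # by symmetry, since P(F(8, 15) > 3.2) = 0.025

f_height = variance_ratio(6.0, 4.0)   # boys sd = 6cm, girls sd = 4cm
reject = two_tailed_f_test(f_height, LOWER_CRIT, UPPER_CRIT)

print(f_height, round(LOWER_CRIT, 2), reject)
```

Passing the critical values in as arguments keeps the sketch independent of any statistics library. With one available, they could be computed from the F distribution instead of read from a table.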

One-Way ANOVA to Analyse Function of Reviews
Example: Gisela Redeker identified three roles for literary book reviews in newspapers (Taalbeheersing 21(4), 1999, 295-310):
- communicate emotional reactions, subjective opinions
- communicate expert opinion, objective facts
- motivate reading and purchasing of book
She investigated whether different reviewers emphasized different roles: Tom van Deel (Trouw), Arnold Heumakers (de Volkskrant), and Carel Peeters (Vrij Nederland).
Stylistic elements indicate one of the three functions, e.g., "ik" (I), "maar nee" (but no), "ben ik bang" (I'm afraid), "lijkt" (seems), "eerlijk gezegd" (frankly), "ik bedoel" (I mean) indicate subjective opinions; the logical connectives "want" (because) and "temeer dat" (all the more since), and quotes, indicate an objective point of view; etc.
N.b. validating the link between style and perspectives is important (see Redeker).

Redeker on Literary Criticism's Functions
Gisela Redeker investigates the role of literary criticism, asking whether different critics differ in the degree to which they emphasize one or another role.
- Sample: reviews of the same books (by Hermans, Heijne and Mulisch), all published 1989-92, similar in length
- Data: relative frequency of, e.g., reader-oriented elements (per 1,000 words)
- We are comparing three averages, asking whether there is a difference
- Because she compared more than two averages, ANOVA is needed

Relative Frequency of Reader-Oriented Elements

    elements         Critic
                     van Deel    Heumakers    Peeters
    evocative        124         108          151
    questions        32          0            61
    dram. quotes     69          129          82
    intensifiers     259         301          382
    ref. to reader   36          57           117
    Totals           260         298          397

$H_0: \mu_1 = \mu_2 = \mu_3$
$H_a: \mu_1 \neq \mu_2$ or $\mu_1 \neq \mu_3$ or $\mu_2 \neq \mu_3$
Results: statistically significant difference (p < 0.02)
Similar comparisons for subjective style and argumentative style (differences present, not statistically significant)

One-Way ANOVA
Question: Are the exam grades of four groups of foreign students in "Nederlands voor anderstaligen" (Dutch for speakers of other languages) the same? More exactly, are the four averages the same?
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_a: \mu_1 \neq \mu_2$ or $\mu_1 \neq \mu_3$ or ... or $\mu_3 \neq \mu_4$
i.e., alternative: at least one group has a different mean
- for the question of whether any particular pair is the same, the t-test is appropriate
- for testing whether all language groups are the same, pairwise t-tests will exaggerate differences (increase the chance of type I error)
- we want to apply 1-way ANOVA

Data: Dutch Proficiency of Foreigners
Four groups of ten each (excerpt):

    Groups            Eur      Amer     Afri     Asia
                      10       33       26       26
                      19       21       25       21
                      31       20       15       21
                      ...
    Mean              25.0     21.9     23.1     21.3
    Samp. SD          8.14     6.61     5.92     6.90
    Samp. Variance    66.22    43.66    34.99    47.57

Anova Conditions
Assumption: normal distribution per group; check with a normal quantile plot, e.g., for Europeans (and to be repeated for every group):
[Figure: Normal Q-Q plot of "toets nl voor anderstalige" (Dutch test for speakers of other languages); x-axis: Observed Value, y-axis: Expected Normal Value]

Anova Conditions
Normal quantile plot for all values:
[Figure: Normal Q-Q plot of "toets nl voor anderstalige"; x-axis: Observed Value, y-axis: Expected Normal Value]

Anova Conditions
ANOVA assumptions:
- normal distribution per subgroup
- same variance in subgroups: least sd > one-half of greatest sd
- independent observations: watch out for test-retest situations!
Check differences in SDs! (some SPSS computing)

    Variable    Std Dev    Valid N    Label
    Europa      8.14       10
    America     6.61       10
    Africa      5.92       10
    Azie        6.90       10
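
The rule of thumb for equal variances (least sd greater than one-half of greatest sd) is easy to check on the four standard deviations above. A minimal sketch; the function name `sds_comparable` is illustrative.

```python
# Standard deviations per group, as listed in the SPSS output
group_sds = {"Europa": 8.14, "America": 6.61, "Africa": 5.92, "Azie": 6.90}

def sds_comparable(sds) -> bool:
    """Rule of thumb from the slide: smallest SD must exceed half the largest SD."""
    sds = list(sds)
    return min(sds) > 0.5 * max(sds)

ratio = min(group_sds.values()) / max(group_sds.values())
ok = sds_comparable(group_sds.values())

print(round(ratio, 2), ok)
```

Here the smallest SD is about 0.73 of the largest, so the condition holds for these data.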

Visualizing Anova
Is there any significant difference in the means (of the groups being contrasted)?
[Figure: boxplots of "toets nl voor anderstalige" (Dutch test for speakers of other languages) by "gebied van afkomst" (region of origin): Europa, America, Africa, Azie; N = 10 per group]
Take care that boxplots sketch medians, not means.

Sketch of Anova

    Groups         1            2            3            4   (= I, the number of groups)
                   Eur          Amer         Afri         Asia
    Data           $x_{1,j}$    $x_{2,j}$    $x_{3,j}$    $x_{4,j}$
    Sample mean    $\bar{x}_1, \ldots, \bar{x}_i, \ldots$

For any data point $x_{i,j}$:
$(x_{i,j} - \bar{x}) = (\bar{x}_i - \bar{x}) + (x_{i,j} - \bar{x}_i)$
total residue = group diff + error
ANOVA question: is it sensible to include the group ($\bar{x}_i$)?

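Two Variances
Reminder of high-school algebra: $(a + b)^2 = a^2 + b^2 + 2ab$
[Figure: square with side A + B, partitioned into areas A², B², and two rectangles AB]

Two Variances
$(a + b)^2 = a^2 + b^2 + 2ab$
$(x_{i,j} - \bar{x}) = (\bar{x}_i - \bar{x}) + (x_{i,j} - \bar{x}_i)$
$(x_{i,j} - \bar{x})^2 = (\bar{x}_i - \bar{x})^2 + (x_{i,j} - \bar{x}_i)^2 + 2(\bar{x}_i - \bar{x})(x_{i,j} - \bar{x}_i)$
Sum over elements in i-th group:
$\sum_{j=1}^{N_i} (x_{i,j} - \bar{x})^2 = \sum_{j=1}^{N_i} (\bar{x}_i - \bar{x})^2 + \sum_{j=1}^{N_i} (x_{i,j} - \bar{x}_i)^2 + \sum_{j=1}^{N_i} 2(\bar{x}_i - \bar{x})(x_{i,j} - \bar{x}_i)$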

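Two Variances
Note that this term must be zero:
$\sum_{j=1}^{N_i} 2(\bar{x}_i - \bar{x})(x_{i,j} - \bar{x}_i)$
Since:
$\sum_{j=1}^{N_i} 2(\bar{x}_i - \bar{x})(x_{i,j} - \bar{x}_i) = 2(\bar{x}_i - \bar{x}) \sum_{j=1}^{N_i} (x_{i,j} - \bar{x}_i)$ and $\sum_{j=1}^{N_i} (x_{i,j} - \bar{x}_i) = 0$

Sketch of Anova
$\sum_{j=1}^{N_i} (x_{i,j} - \bar{x})^2 = \sum_{j=1}^{N_i} (\bar{x}_i - \bar{x})^2 + \sum_{j=1}^{N_i} (x_{i,j} - \bar{x}_i)^2 \quad \left(+ \sum_{j=1}^{N_i} 2(\bar{x}_i - \bar{x})(x_{i,j} - \bar{x}_i) = 0\right)$
Therefore:
$\sum_{i=1}^{I} \sum_{j=1}^{N_i} (x_{i,j} - \bar{x})^2 = \sum_{i=1}^{I} \sum_{j=1}^{N_i} (\bar{x}_i - \bar{x})^2 + \sum_{i=1}^{I} \sum_{j=1}^{N_i} (x_{i,j} - \bar{x}_i)^2$
SST = SSG + SSE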
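
The decomposition SST = SSG + SSE can be verified numerically. The sketch below uses small made-up groups of unequal size (hypothetical data, not the exam scores) to show that the identity does not depend on equal group sizes. The function name `sums_of_squares` is illustrative.

```python
from statistics import mean

# Hypothetical data: three groups of unequal size
groups = {
    "a": [4.0, 6.0, 8.0],
    "b": [1.0, 2.0, 3.0, 6.0],
    "c": [9.0, 11.0],
}

def sums_of_squares(groups):
    """Return (SST, SSG, SSE) for a dict mapping group name to a list of values."""
    all_values = [x for xs in groups.values() for x in xs]
    grand_mean = mean(all_values)
    # Total: every data point against the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Groups: each group mean against the grand mean, once per data point
    ssg = sum(len(xs) * (mean(xs) - grand_mean) ** 2 for xs in groups.values())
    # Error: every data point against its own group mean
    sse = sum((x - mean(xs)) ** 2 for xs in groups.values() for x in xs)
    return sst, ssg, sse

sst, ssg, sse = sums_of_squares(groups)
# The cross term vanishes because deviations from a group mean sum to zero
cross_sums = [sum(x - mean(xs) for x in xs) for xs in groups.values()]

print(round(sst, 4), round(ssg + sse, 4))
```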

Anova Terminology
For any data point $x_{i,j}$:
$(x_{i,j} - \bar{x}) = (\bar{x}_i - \bar{x}) + (x_{i,j} - \bar{x}_i)$
total residue = group diff + error
$\sum_{i} \sum_{j} (x_{i,j} - \bar{x})^2 = \sum_{i} \sum_{j} (\bar{x}_i - \bar{x})^2 + \sum_{i} \sum_{j} (x_{i,j} - \bar{x}_i)^2$
SST = SSG + SSE
Total Sum of Squares = Group Sum of Squares + Error Sum of Squares
and
DFT = DFG + DFE
$(n - 1) = (I - 1) + (n - I)$
Total Degrees of Freedom = Group Degrees of Freedom + Error Degrees of Freedom

Variances are Mean Squared Differences to Mean
Note that $\frac{\sum_{i} \sum_{j} (x_{i,j} - \bar{x})^2}{n - 1}$ is a variance (= SST/DFT), and likewise SSG/DFG (= MSG) and SSE/DFE (= MSE).
In ANOVA, we compare MSG (variance between groups) and MSE (variance within groups), i.e., we measure
$F = \frac{MSG}{MSE}$
If this is large, differences between groups overshadow differences within groups.

Two Variances
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
ANOVA: calculate MSG ($\sigma^2$ between groups) and MSE ($\sigma^2$ within groups), i.e., we measure
$F = \frac{MSG}{MSE}$
If this is large, differences between groups overshadow differences within groups.

Two Variances
1. estimate pooled variance of population (MSE):
$\frac{\sum_{i \in G} df_i \, s_i^2}{\sum_{i \in G} df_i} = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2 + (n_3 - 1)s_3^2 + (n_4 - 1)s_4^2}{(n_1 - 1) + (n_2 - 1) + (n_3 - 1) + (n_4 - 1)}$
$= \frac{9 \cdot 66.22 + 9 \cdot 43.66 + 9 \cdot 34.99 + 9 \cdot 47.57}{9 + 9 + 9 + 9} = \frac{595.98 + 392.94 + 314.91 + 428.13}{36} = 48.11$
estimates variance in groups (using df), aka within-groups estimate of variance
2. suppose $H_0$ true
   (a) then the group sample means have mean $\mu$, variance $\sigma^2/10$ (& sd $\sigma/\sqrt{10}$)
   (b) 4 means, 25.0, 21.9, 23.1, 21.3, where $s = 1.63$, $s^2 = 2.66$
   (c) $s^2$ is an estimate of $\sigma^2/10$, i.e., $10 \times s^2$ is an estimate of $\sigma^2$ ($10 \times s^2 = 26.6$)
   (d) this is the between-groups variance (MSG)
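
Both variance estimates can be recomputed from the group summaries alone. A minimal sketch, assuming equal group sizes (ten per group, as in the data). The shortcut MSG = n × variance of the group means holds only in that balanced case.

```python
from statistics import variance

n_per_group = 10
group_means = [25.0, 21.9, 23.1, 21.3]          # Eur, Amer, Afri, Asia
group_variances = [66.22, 43.66, 34.99, 47.57]  # sample variances per group

# Within-groups estimate (MSE): df-weighted average of the group variances
dfs = [n_per_group - 1] * len(group_variances)
mse = sum(df * s2 for df, s2 in zip(dfs, group_variances)) / sum(dfs)

# Between-groups estimate (MSG): n times the sample variance of the group means
# (valid as written only for equal group sizes)
msg = n_per_group * variance(group_means)

f_ratio = msg / mse

print(round(mse, 2), round(msg, 2), round(f_ratio, 2))
```

An F ratio below 1 means the group means vary less than the within-group spread alone would lead one to expect.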

Interpreting Estimates via F
if $H_0$ true, then we have two variances:
- between-groups estimate $s_b^2$ (26.6) and
- within-groups estimate $s_w^2$ (48.11)
and their ratio $\frac{s_b^2}{s_w^2}$ follows an F distribution with (#groups - 1) df from $s_b^2$ (3) and (n - 4) df from $s_w^2$ (36)
in this case, $\frac{26.62}{48.11} = 0.55$
$P(F(3, 30) > 2.92) = 0.05$ (see tables), so no evidence of nonuniform behavior

ANOVA Summary
Summary to-date (exam results for NL voor anderstalige)

    Source       df    SS        MSS     F
    between-g    3     79.9      26.6    F(3,36) = 0.55
    within-g     36    1731.9    48.1
    Total        39    1811.8

$P(F(3, 30) > 2.92) = 0.05$ (see tables), so no evidence of nonuniform behavior

SPSS Summary

    - - - - - O N E W A Y - - - - -
    Variable     NL_NIVO   toets nl voor anderstalige
    By Variable  GROUP     gebied van afkomst

    Analysis of Variance
                            Sum of     Mean       F       F
    Source            DF    Squares    Squares    Ratio   Prob.
    Between Groups    3     79.9       26.6       .55     .65
    Within Groups     36    1731.9     48.1
    Total             39    1811.8

(Labels: "toets nl voor anderstalige" = Dutch test for speakers of other languages; "gebied van afkomst" = region of origin.)

Other Questions
ANOVA has $H_0: \mu_1 = \mu_2 = \ldots = \mu_n$
But sometimes particular contrasts are important, e.g., are Europeans better (in learning Dutch)?
Distinguish (in reporting results):
- prior contrasts: questions asked before data collected and analyzed
- post-hoc (posterior) questions: questions asked after collection and analysis
- data-snooping is exploratory, cannot contribute to hypothesis testing

Prior Contrasts
Questions asked before collection and analysis, e.g., are Europeans better (in learning Dutch)?
Another formulation: is $H_a: \mu_{Eur} \neq (\mu_{Am} = \mu_{Afr} = \mu_{Asia})$
where $H_0: \mu_{Eur} = \frac{1}{3}(\mu_{Am} + \mu_{Afr} + \mu_{Asia})$
Reformulation: $0 = -\mu_{Eur} + 0.33\mu_{Am} + 0.33\mu_{Afr} + 0.33\mu_{Asia}$
SPSS requires this (reformulated) version

Prior Contrasts in SPSS
- (the mean of) every group gets a coefficient
- sum of coefficients is zero
- a t-test is carried out, & two-tailed p value is reported (as usual)

                  Eur     Am     Afr    Azie
    Contrast 1    -1.0    .3     .3     .3

    Pooled Variance Estimate
                  Value    S. Error    T Value    DF    T Prob.
    Contrast 1    -2.9     2.53        -1.15      36    .260

No significant difference here (of course)
Note: prior contrasts are legitimate as hypothesis tests as long as they are formulated before collection and analysis
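
The contrast figures in the SPSS output can be reproduced by hand. This sketch assumes the standard pooled-variance standard error for a contrast, $\sqrt{MSE \cdot \sum_i c_i^2 / n_i}$, which is not spelled out on the slide, and it uses exact coefficients of 1/3 where SPSS prints the rounded .3.

```python
from math import sqrt

group_means = {"Eur": 25.0, "Am": 21.9, "Afr": 23.1, "Azie": 21.3}
# Exact contrast coefficients; SPSS displays the 1/3 values rounded to .3
coefficients = {"Eur": -1.0, "Am": 1 / 3, "Afr": 1 / 3, "Azie": 1 / 3}
mse = 48.11        # pooled within-groups variance from the ANOVA table
n_per_group = 10   # equal group sizes

# Contrast value: weighted sum of the group means
value = sum(coefficients[g] * group_means[g] for g in group_means)

# Standard error of the contrast under the pooled variance estimate
std_error = sqrt(mse * sum(c ** 2 for c in coefficients.values()) / n_per_group)

t_value = value / std_error   # compared against t with 36 degrees of freedom

print(round(value, 1), round(std_error, 2), round(t_value, 2))
```

The result agrees with the Value, S. Error and T Value columns above to the precision shown there.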

Post-hoc Questions
Assume $H_0$ rejected: which means are distinct?
- Data-snooping problem: in a large set, some distinctions are likely to be statistically significant
- But we can still look (we just cannot claim to have tested the hypothesis)
- We are asking whether a difference $m_1 - m_2$ is significantly large; we apply a variant of the t-test
- The relevant sd is $\sqrt{MSE/n}$ (differences among scores), but there's a correction since we're looking at half the scores in any one comparison

SD in Post-hoc ANOVA Questions
N.B. SD (among differences in groups i and j):
$sd_\delta = \sqrt{\frac{MSE \cdot \frac{N_i + N_j}{N}}{N_i + N_j}} = \sqrt{\frac{48.1 \cdot \frac{10 + 10}{40}}{10 + 10}} = \sqrt{\frac{48.1/2}{20}} = \sqrt{\frac{24.05}{20}} = \frac{4.9}{\sqrt{20}}$
and the critical t value is taken at significance level $p/c$, where p is the desired significance level and c is the number of comparisons.
For pairwise comparisons, $c = \binom{I}{2}$

Post-hoc Questions in SPSS
SPSS Post-Hoc Bonferroni searches among all groupings for statistically significant ones

    - - - - - O N E W A Y - - - - -
    Variable     NL_NIVO   toets nl voor anderstalige
    By Variable  GROUP     gebied van afkomst

    Multiple Range Tests: Modified LSD (Bonferroni) test with significance level .05

    The difference between two means is significant if
      MEAN(J)-MEAN(I) >= 4.9045 * RANGE * SQRT(1/N(I) + 1/N(J))
      with the following value(s) for RANGE: 3.95

    - No two groups significantly different at .05 level

    Homogeneous Subsets (highest & lowest means not sig. diff.)
    Group    Azie    America    Africa    Europa
    Mean     21.3    21.9       23.1      25.0

but in this case there are none (of course)

How to Win at ANOVA
Note ways in which the F ratio increases (becomes more significant):
$F = \frac{MSG}{MSE}$
1. MSG increases, differences in means grow larger
2. MSE decreases, overall variation grows smaller
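
The Bonferroni criterion printed in the SPSS output above can be applied directly to the four group means. A minimal sketch: the constants 4.9045 and 3.95 are copied from that output, not derived here (4.9045 appears to match $\sqrt{MSE/2}$ with MSE = 48.11), and the function name `threshold` is illustrative.

```python
from itertools import combinations
from math import sqrt

group_means = {"Azie": 21.3, "America": 21.9, "Africa": 23.1, "Europa": 25.0}
n_per_group = 10

# Constants as printed in the SPSS output
SCALE = 4.9045
RANGE = 3.95

def threshold(n_i: int, n_j: int) -> float:
    """Smallest difference in means that counts as significant for groups i and j."""
    return SCALE * RANGE * sqrt(1 / n_i + 1 / n_j)

crit = threshold(n_per_group, n_per_group)

# Check all pairs of groups against the criterion
significant_pairs = [
    (a, b)
    for a, b in combinations(group_means, 2)
    if abs(group_means[a] - group_means[b]) >= crit
]

print(round(crit, 2), significant_pairs)
```

The largest observed difference (Europa vs. Azie, 3.7 points) is well below the threshold, consistent with SPSS reporting no significantly different pair.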

Two Models for Grouped Data
First model: $x_{i,j} = \mu + \epsilon_{i,j}$
- no group effect
- each datapoint represents error ($\epsilon$) around a mean ($\mu$)
Second model: $x_{i,j} = \mu + \alpha_i + \epsilon_{i,j}$
- real group effect
- each datapoint represents error ($\epsilon$) around an overall mean ($\mu$) combined with a group adjustment ($\alpha_i$)
ANOVA: is there sufficient evidence for $\alpha_i$?

Next: Two-way ANOVA