Statistiek II. John Nerbonne. February 26, Dept of Information Science based also on H.Fitz s reworking

Similar documents
Statistiek II. John Nerbonne using reworkings by Hartmut Fitz and Wilbert Heeringa. February 13, Dept of Information Science

compare time needed for lexical recognition in healthy adults, patients with Wernicke s aphasia, patients with Broca s aphasia

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

Analysis of Variance Inf. Stats. ANOVA ANalysis Of VAriance. Analysis of Variance Inf. Stats

ANOVA: Analysis of Variation

Example: Four levels of herbicide strength in an experiment on dry weight of treated plants.

Two-Way ANOVA. Chapter 15

Statistiek II. John Nerbonne. March 17, Dept of Information Science incl. important reworkings by Harmut Fitz

The One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs)

Analyses of Variance. Block 2b

1 The Classic Bivariate Least Squares Model

Basics on t-tests Independent Sample t-tests Single-Sample t-tests Summary of t-tests Multiple Tests, Effect Size Proportions. Statistiek I.

psyc3010 lecture 2 factorial between-ps ANOVA I: omnibus tests

Introduction to Analysis of Variance. Chapter 11

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Factorial Analysis of Variance

Inference for the Regression Coefficient

An Old Research Question

Unbalanced Data in Factorials Types I, II, III SS Part 1

Econ 3790: Business and Economic Statistics. Instructor: Yogesh Uppal

ANOVA: Analysis of Variance

In ANOVA the response variable is numerical and the explanatory variables are categorical.

ANOVA (Analysis of Variance) output RLS 11/20/2016

ANOVA: Comparing More Than Two Means

The t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies

ANCOVA. Lecture 9 Andrew Ainsworth

ANOVA continued. Chapter 11

Battery Life. Factory

Factorial Analysis of Variance

Analysis of Variance: Repeated measures

Difference in two or more average scores in different groups

4.1. Introduction: Comparing Means

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01

Variance Decomposition and Goodness of Fit

ANOVA continued. Chapter 10

Linear regression. We have that the estimated mean in linear regression is. ˆµ Y X=x = ˆβ 0 + ˆβ 1 x. The standard error of ˆµ Y X=x is.

Chapter 7 Factorial ANOVA: Two-way ANOVA

Topic 9: Factorial treatment structures. Introduction. Terminology. Example of a 2x2 factorial

Analysis of Variance (ANOVA)

Wolf River. Lecture 15 - ANOVA. Exploratory analysis. Wolf River - Data. Sta102 / BME102. October 22, 2014

Unit 27 One-Way Analysis of Variance

ANOVA continued. Chapter 10

1 Use of indicator random variables. (Chapter 8)

ANOVA: Analysis of Variance

Wolf River. Lecture 15 - ANOVA. Exploratory analysis. Wolf River - Data. Sta102 / BME102. October 26, 2015

Wolf River. Lecture 19 - ANOVA. Exploratory analysis. Wolf River - Data. Sta 111. June 11, 2014

One-way between-subjects ANOVA. Comparing three or more independent means

MANOVA is an extension of the univariate ANOVA as it involves more than one Dependent Variable (DV). The following are assumptions for using MANOVA:

Independent Samples ANOVA

PSY 216. Assignment 9 Answers. Under what circumstances is a t statistic used instead of a z-score for a hypothesis test

Chapter 10. Design of Experiments and Analysis of Variance

1. What does the alternate hypothesis ask for a one-way between-subjects analysis of variance?

22s:152 Applied Linear Regression. Chapter 8: 1-Way Analysis of Variance (ANOVA) 2-Way Analysis of Variance (ANOVA)

UNIVERSITY OF TORONTO. Faculty of Arts and Science APRIL - MAY 2005 EXAMINATIONS STA 248 H1S. Duration - 3 hours. Aids Allowed: Calculator

22s:152 Applied Linear Regression. Take random samples from each of m populations.

T-Test QUESTION T-TEST GROUPS = sex(1 2) /MISSING = ANALYSIS /VARIABLES = quiz1 quiz2 quiz3 quiz4 quiz5 final total /CRITERIA = CI(.95).

BIOL Biometry LAB 6 - SINGLE FACTOR ANOVA and MULTIPLE COMPARISON PROCEDURES

1. The (dependent variable) is the variable of interest to be measured in the experiment.

ANOVA Analysis of Variance

Inference for Regression

Sociology 6Z03 Review II

Note: k = the # of conditions n = # of data points in a condition N = total # of data points

Chap The McGraw-Hill Companies, Inc. All rights reserved.

Chapter 11 - Lecture 1 Single Factor ANOVA

One-way between-subjects ANOVA. Comparing three or more independent means

Variance Decomposition in Regression James M. Murray, Ph.D. University of Wisconsin - La Crosse Updated: October 04, 2017

Statistics For Economics & Business

CHAPTER 7 - FACTORIAL ANOVA

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.

Sampling Distributions: Central Limit Theorem

Chapter 11 - Lecture 1 Single Factor ANOVA

Sleep data, two drugs Ch13.xls

Lecture 15 - ANOVA cont.

Hypothesis testing: Steps

R 2 and F -Tests and ANOVA

Example - Alfalfa (11.6.1) Lecture 16 - ANOVA cont. Alfalfa Hypotheses. Treatment Effect

Week 7.1--IES 612-STA STA doc

Biostatistics for physicists fall Correlation Linear regression Analysis of variance

10/31/2012. One-Way ANOVA F-test

Data Analysis and Statistical Methods Statistics 651

STAT Chapter 10: Analysis of Variance

Extensions of One-Way ANOVA.

Two-Way Factorial Designs

Lecture 6 Multiple Linear Regression, cont.

SEVERAL μs AND MEDIANS: MORE ISSUES. Business Statistics

In a one-way ANOVA, the total sums of squares among observations is partitioned into two components: Sums of squares represent:

Extensions of One-Way ANOVA.

Simple Linear Regression: One Quantitative IV

Example - Alfalfa (11.6.1) Lecture 14 - ANOVA cont. Alfalfa Hypotheses. Treatment Effect

Analysis of Variance: Part 1

Lectures on Simple Linear Regression Stat 431, Summer 2012

Inference for Regression Simple Linear Regression

Inference for Regression Inference about the Regression Model and Using the Regression Line

The simple linear regression model discussed in Chapter 13 was written as

Biostatistics 380 Multiple Regression 1. Multiple Regression

SCHOOL OF MATHEMATICS AND STATISTICS

Prepared by: Prof. Dr Bahaman Abu Samah Department of Professional Development and Continuing Education Faculty of Educational Studies Universiti

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

Diagnostics and Remedial Measures

Announcements. Statistics 104 (Mine C etinkaya-rundel) Diamonds. From last time.. = 22

Transcription:

Dept of Information Science j.nerbonne@rug.nl based also on H.Fitz s reworking February 26, 2014

Last week: one-way ANOVA generalized t-test to compare means of more than two groups example: (a) compare frequencies of stylistic elements in three book reviews (b) compare Dutch proficiency test results of four groups of foreigners assumptions of (i) normality (ii) and similar standard deviations in each group (iii) independent samples partitioning of total variance (SST) into between-groups variance (SSG) and error variance (SSE): SST = SSG + SSE based on F -distribution: F = MSG MSE Today: factorial ANOVA

Factorial ANOVA Like one-way ANOVA, but more than one factor (aka n-way ANOVA) compares means of different groups based on F -distribution: F = s2 1 s2 2 always positive two kinds of degrees of freedom: dfs1, df s2 value of 1 indicates same variance, values near 0 or + indicate difference uses F -distribution: compare variances among means with random variability inside the groups one-way ANOVA, n-way ANOVA F -test! assumes near-normal distribution in all groups standard deviations in all groups roughly equal ( sd i sd j 2)

Factorial ANOVA Why two-way ANOVA? why not just two one-way ANOVAs? efficient in the number of experiments and subjects needed

Factorial ANOVA Why two-way ANOVA? why not just two one-way ANOVAs? efficient in the number of experiments and subjects needed Suppose we want to measure effect of calcium and magnesium intake on blood pressure Two-way ANOVA: Calcium Magnesium L M H L 1 2 3 M 4 5 6 H 7 8 9 two-way design results in 9 groups assign 9 subjects to each group hence 81 subjects required

Factorial ANOVA Why two-way ANOVA? why not just two one-way ANOVAs? efficient in the number of experiments and subjects needed Suppose we want to measure effect of calcium and magnesium intake on blood pressure Two one-way ANOVAs: Calcium Magnesium L M H M 1 2 3 Magnesium Calcium L M H M 1 2 3 two one-way designs result in 6 groups assign 15 subjects to each group hence 90 subjects required and: only 15 subjects per level compared with 27 in two-way design

Factorial ANOVA Why two-way ANOVA? why not just two one-way ANOVAs? efficient in the number of experiments and subjects needed combining two experiments into one improves accuracy: increases number of data points per level decreases SE (standard error of the mean): standard deviation of sample mean: σ in one-way ANOVA: 15 = 0.26σ σ in two-way ANOVA: 27 = 0.19σ σ n Hence, sample mean responses are less variable in two-way design

Factorial ANOVA Why two-way ANOVA? why not just two one-way ANOVAs? efficient in the number of experiments and subjects needed combining two experiments into one improves accuracy (increases n, decreases SE) opportunity to study interaction: E.g., age and subtype of cancer have independent effects on mortality: breast cancer more treatable than other forms of cancer in general, cancer more treatable with young age but these are reversed in some combinations, e.g., breast cancer in young women particularly aggressive and dangerous. Interaction requires care!

Types of interaction Two drugs A and B administered in doses 0 and 1. Dependent measure: blood level of some hormone. drugs show no effect either separately or in combination no interaction null hypothesis true

Types of interaction Two drugs A and B administered in doses 0 and 1. Dependent measure: blood level of some hormone. both drugs have an effect combined effect is additive no interaction

Types of interaction Two drugs A and B administered in doses 0 and 1. Dependent measure: blood level of some hormone. both drugs have an effect combined effect is stronger than the sum of separate effects interaction

Types of interaction Two drugs A and B administered in doses 0 and 1. Dependent measure: blood level of some hormone. both drugs have the same effect as previously! when combined, the two drugs cancel each other out interaction

Factorial ANOVA: partitioning the variance As in one-way ANOVA: SST = SSG + SSE Total Sum of Squares Group Sum of Squares Error Sum of Squares SSG: aggregate measure of differences between groups SSE: aggregate measure of random variability inside groups But: in n-way ANOVA several factors contribute to between-groups variance (SSG) To measure the effect of different factors, we partition the SSG into components which correspond to these factors

Factorial ANOVA: partitioning the variance For example: two factors A & B, then SSG partitions into: SSG = SS A + SS B + SS A B main effect factor A main effect factor B interaction effect In one-way ANOVA: SSG = In factorial ANOVA: SS A = number of levels in factor A. I N i (x i x) 2 i=1 I A i=1 N i (x i x) 2 where I A is the Note: three factors A, B, C induce four interaction sum of squares: SS A B, SS A C, SS B C, SS A B C

Factorial ANOVA: degrees of freedom Degrees of freedom are partitioned similarly: DFT = (DF A + DF B + DF A B ) + DFE }{{} DFG In one-way ANOVA: DFG = I 1 (I = number of groups) In two-way ANOVA: DFG = (I A 1) + (I }{{} B 1) + (I }{{} A 1) (I B 1) = I }{{} A I B 1 = I 1 DF A DF B DF A B

Factorial ANOVA: mean squares In factorial ANOVA we obtain several mean squares between groups (here 3): MS A = SS A DF A (factor A) MS B = SS B DF B (factor B) MS A B = SS A B DF A B (interaction) Hence, there are also three F -values F A, F B, and F A B for which we test significance!

Factorial ANOVA schematically One-way ANOVA: Two-way ANOVA: SST SST SSG SSE SSG SSE SS A SS B SS AB MSG MSE MS A MS B MS AB MSE F-value F A F B F AB

Factorial ANOVA schematically One-way ANOVA: Two-way ANOVA: SST SST SSG SSE SSG SSE SS A SS B SS AB MSG MSE MS A MS B MS AB MSE F-value F A F B F AB

Factorial ANOVA example Many studies with children and adults (across languages) show: Object-relative clauses are more difficult to produce/comprehend than subject-relative clauses. For example: Obj-Rel: There is the man [that the dog bit at the park yesterday]. Subj-Rel: There is the boy [that hit the cricket ball over the fence]. Kidd, Brandt, Lieven & Tomasello (Language and Cognitive Processes, 22(6), 2007) investigated what makes object-relative clauses easier to process.

Factorial ANOVA example Many studies with children and adults (across languages) show: Object-relative clauses are more difficult to produce/comprehend than subject-relative clauses. For example: Obj-Rel: There is the man [that the dog bit at the park yesterday]. Subj-Rel: There is the boy [that hit the cricket ball over the fence]. Kidd, Brandt, Lieven & Tomasello (Language and Cognitive Processes, 22(6), 2007) investigated what makes object-relative clauses easier to process.

Factorial ANOVA example Task: 3 4 year-old children had to repeat sentences with relative clauses from an experimenter ( parrot game ) Two kinds of lexical manipulations: Pronominal subjects versus full NPs: This is the boy that you saw at the shop on Saturday. This is the boy that the man saw at the shop on Saturday. Animate versus inanimate head nouns: This is the football that he kicked in the garden yesterday. This is the dog that he kicked in the garden yesterday.

Animacy, pronouns and repetition accuracy Design: Four kinds of sentences shown: Head noun RC-subject Animate Inanimate pronominal pronoun + anim. head pronoun + inanim. head full NP NP + animate head NP + inanimate head Extras: KBLT controlled test sentences for length in words and syllables. Each child saw four different items of each type. Measure: exact repetitions were scored as 1 minor modifications (e.g., tense/aspect) as 0.5 ungrammatical or different syntax as 0

KBLT data and design Label Head noun RC-subject Score ANP animate NP 0.23 ANP animate NP 0.19 ANP animate NP 0.14 ANP animate NP 0.14 INP inanimate NP 0.25.... APro animate pronoun 0.63.... IPro inanimate pronoun 0.58 There are four sentence types, and four different tokens per type. Examples: ANP:...the dog that the man kicked... INP:...the toy that the man kicked... APro:...the dog that he kicked... IPro:...the toy that he kicked...

KBLT data and design Label Head noun RC-subject Score ANP animate NP 0.23 ANP animate NP 0.19 ANP animate NP 0.14 ANP animate NP 0.14 INP inanimate NP 0.25.... APro animate pronoun 0.63.... IPro inanimate pronoun 0.58 Two-way ANOVA by item with head noun animacy and RC-subject type as factors. Dependent variable: score, represents average repetition accuracy of 48 kids (3 4 years of age).

Data: means and SDs of four groups Dependent Variable:score Animacy Subject Mean Std. Deviation N animate NP.172.045 4 pro.633.026 4 Total.402.249 8 inanimate NP.323.069 4 pro.625.091 4 Total.474.178 8 Total NP.247.097 8 pro.629.062 8 Total.438.212 16 Note: SDs not approximately equal (because data is streamlined): 2 (animate+np) (inanimate+pro) Preciser check: Levene s test (for equality of variance); for problems, consider data transformation, including trimming

Check normality of data Normal Q Q Plot: animate+full NP Sample Quantiles 0.25 0.30 0.35 0.40 1.0 0.5 0.0 0.5 1.0 Theoretical Quantiles Check normality assumption for all groups! For problems, consider data transformation, including trimming

Multiple questions Two-way ANOVA asks two/three questions simultaneously: 1. Is head noun animacy affecting repetition accuracy? 2. Does lexical type of subject NP affect repetition accuracy? 3. Do the two effects interact, or are they independent? Questions 1 & 2 might have been asked in separate one-way ANOVA designs (but these would have been more costly in number of subjects) Question 3 is new to two-way ANOVA

Multiple null hypotheses in n-way ANOVA In our example: each of the two factors has two levels Factor A: Levels in A: Factor B: Levels in B: animacy of the head noun animate or inanimate lexical type of relative clause subject pronoun or full NP Three null hypotheses: 1. There is no difference in the means of factor animacy 2. There is no difference in the means of factor subject type 3. There is no interaction between factors animacy and subject type

Visualizing factorial ANOVA questions Question 1: Is head noun animacy affecting repetition accuracy? Repetition accuracy 0.2 0.3 0.4 0.5 0.6 0.7 animate inanimate Head noun animacy Little skew, similar medians, large overlap: probably not significant

Visualizing factorial ANOVA questions Question 2: Does lexical type of subject NP affect repetition accuracy? Repetition accuracy 0.2 0.3 0.4 0.5 0.6 0.7 NP pro Subject type Little skew, different medians, no overlap: very likely significant

Visualizing interaction Repetition accuracy (%) 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 Subject pro NP animate inanimate Head noun If no interaction, lines should be roughly parallel. It looks like inanimate head nouns facilitate the processing of object-relative clauses with full-np RC-subjects. Two-way ANOVA will measure this exactly.

Factorial ANOVA results Calculations compare mean group variance and mean individual variance as ANOVA SPSS terminology: betweensubjects F = MSG MSE between-subjects RC-subject Animate head Inanimate head full NP animate, NP inanimate, NP pronoun animate, pronoun inanimate, pronoun Invoke: General linear model Univariate fixed factors

Factorial ANOVA results Response: Perc Df Sum Sq Mean Sq F value Pr(>F) Animacy 1 0.02051 0.02051 5.2065 0.04154 * NP 1 0.58220 0.58220 147.7608 4.188e-08 *** Animacy:NP 1 0.02523 0.02523 6.4045 0.02639 * Residuals 12 0.04728 0.00394 Signif. codes: 0 *** 0.001 ** 0.01 * 0.05. 1. Animacy of the head noun has a significant effect on repetition accuracy of object-relative clauses (despite the boxplot earlier) 2. The type of RC-subject noun phrase has a profound effect on repetition accuracy 3. Significant interaction: the difference in repetition accuracy between object-relatives with pronominal and full-np subjects is significantly smaller when head nouns are inanimate.

Measuring effect size We found significant main effects for both factors, and an interaction effect. Note: because factors only have two levels here, no need to do post-hoc tests. Additional question: How large are the effects we found, i.e. how meaningful are the results? Effect size indicates the amount of variability in the dependent variable that can be accounted for by the independent variable. Note: effect size is not the same as the ANOVA p-value: Smaller p-value does not mean a bigger effect, because p depends on the sample size (as well as the effect size).

Measuring effect size Effect size for one-way ANOVA: η 2 = SSG SST ( eta-squared ) η 2 indicates proportion of variance in the dependent variable accounted for by differences between the levels of the factor. η 2 not suitable for n-way ANOVA because SST depends on presence of other factors! Effect size for two-way ANOVA: ηp 2 = SS A ( partial etasquared ) SS A +SSE This is easier to interpret: how much of the random variance was eliminated by this factor? In other words: from SSG we take the portion of the variance that can be attributed to factor A, and from SST we take that same portion plus the random within-groups variability.

Measuring effect size In our example: η 2 p = SS animacy SS animacy +SSE = 0.021 0.021+0.04 = 0.3 η 2 p = SS subject SS subject +SSE = 0.582 0.582+0.04 = 0.925 η 2 p = SS interaction SS interaction +SSE = 0.025 0.025+0.04 = 0.348 Rule of thumb: ηp 2 < 0.1 weak effect 0.1 ηp 2 < 0.6 medium-sized effect ηp 2 0.6 large effect

Fallbacks Factorial analysis of variance: normality violated Common wisdom until recently: no fallback

Fallbacks Factorial analysis of variance: normality violated Common wisdom until recently: no fallback But Field (2012:534) recommends R package pbad2way Your instructor hasn seen more on this! But it sounds right, using a permutation technique to assess significance.

Factorial analysis of variance Factorial analysis of variance: generalized t-test compares means compares groups along > 1 dimensions, e.g., school classes and gender assumes normal distributions, similar SDs in each group typical application: compare processing times for two syntactic structures under two phonological conditions (factorial design) compares variance among means vs. general variance (F -score) efficient in the use of subjects and experiment time allows (and forces!) attention to potential interaction

Factorial ANOVA: another perspective Recall that ANOVA seeks evidence for α i (in comparison of models): x ij = µ + ɛ ij x ij = µ + α i + ɛ ij Similarly, factorial ANOVA asks separately for significance of α i, β j, and interaction (αβ) ij, comparing models: x ij = µ + ɛ ij x ij = µ + α i + β j + (αβ) ij + ɛ ij

Factorial ANOVA models First model: no group effects x ij = µ + ɛ ij x ij = µ + α i + β j + (αβ) ij + ɛ ij each data point represents error (ɛ) around a mean (µ) Second model: real group effect(s) each data point represents error (ɛ) around an overall mean (µ), combined with one or two group adjustments (α i and β j ) possibly group effects involve interaction (αβ ij )

Next week Next week we will look at repeated measures ANOVA with one factor factorial ANOVA with one within-subjects factor