Research Methods II MICHAEL BERNSTEIN CS 376

Similar documents
Chapter 11 - Lecture 1 Single Factor ANOVA

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

CHAPTER 4 Analysis of Variance. One-way ANOVA Two-way ANOVA i) Two way ANOVA without replication ii) Two way ANOVA with replication

n i n T Note: You can use the fact that t(.975; 10) = 2.228, t(.95; 10) = 1.813, t(.975; 12) = 2.179, t(.95; 12) =

Much of the material we will be covering for a while has to do with designing an experimental study that concerns some phenomenon of interest.

Solutions to Final STAT 421, Fall 2008

DESAIN EKSPERIMEN Analysis of Variances (ANOVA) Semester Genap 2017/2018 Jurusan Teknik Industri Universitas Brawijaya

Design & Analysis of Experiments 7E 2009 Montgomery

Multiple comparisons - subsequent inferences for two-way ANOVA

Chapter 10: Analysis of variance (ANOVA)

Econ 3790: Business and Economic Statistics. Instructor: Yogesh Uppal

Theorem A: Expectations of Sums of Squares Under the two-way ANOVA model, E(X i X) 2 = (µ i µ) 2 + n 1 n σ2

STAT 705 Chapter 16: One-way ANOVA

10 One-way analysis of variance (ANOVA)

Lec 1: An Introduction to ANOVA

Battery Life. Factory

16.3 One-Way ANOVA: The Procedure

Analysis of Variance

ANOVA: Comparing More Than Two Means

Notes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1

iron retention (log) high Fe2+ medium Fe2+ high Fe3+ medium Fe3+ low Fe2+ low Fe3+ 2 Two-way ANOVA

QUEEN MARY, UNIVERSITY OF LONDON

4.1. Introduction: Comparing Means

Statistics for Managers Using Microsoft Excel Chapter 10 ANOVA and Other C-Sample Tests With Numerical Data

22s:152 Applied Linear Regression. Take random samples from each of m populations.

CS 5014: Research Methods in Computer Science

Introduction to the Analysis of Variance (ANOVA) Computing One-Way Independent Measures (Between Subjects) ANOVAs

22s:152 Applied Linear Regression. There are a couple commonly used models for a one-way ANOVA with m groups. Chapter 8: ANOVA

Week 14 Comparing k(> 2) Populations

Lecture notes 13: ANOVA (a.k.a. Analysis of Variance)

Chapter 12. Analysis of variance

The One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs)

Sociology 6Z03 Review II

1 Introduction to One-way ANOVA

Extending the Robust Means Modeling Framework. Alyssa Counsell, Phil Chalmers, Matt Sigal, Rob Cribbie

EX1. One way ANOVA: miles versus Plug. a) What are the hypotheses to be tested? b) What are df 1 and df 2? Verify by hand. , y 3

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)

STA 302 H1F / 1001 HF Fall 2007 Test 1 October 24, 2007

The legacy of Sir Ronald A. Fisher. Fisher s three fundamental principles: local control, replication, and randomization.

Chapter 11 - Lecture 1 Single Factor ANOVA

Section 4.6 Simple Linear Regression

Formal Statement of Simple Linear Regression Model

One-way Analysis of Variance. Major Points. T-test. Ψ320 Ainsworth

STAT 135 Lab 9 Multiple Testing, One-Way ANOVA and Kruskal-Wallis

Unit 12: Analysis of Single Factor Experiments

Statistiek II. John Nerbonne using reworkings by Hartmut Fitz and Wilbert Heeringa. February 13, Dept of Information Science

Comparisons among means (or, the analysis of factor effects)

STAT 263/363: Experimental Design Winter 2016/17. Lecture 1 January 9. Why perform Design of Experiments (DOE)? There are at least two reasons:

Analysis of Variance

Remedial Measures, Brown-Forsythe test, F test

Tukey Complete Pairwise Post-Hoc Comparison

6. Multiple Linear Regression

STAT 430 (Fall 2017): Tutorial 8

STATS Analysis of variance: ANOVA

One-way ANOVA (Single-Factor CRD)

ANOVA Analysis of Variance

Factorial designs. Experiments

Outline Topic 21 - Two Factor ANOVA

Two-Sample Inferential Statistics

One-Way Analysis of Variance (ANOVA)

ANOVA (Analysis of Variance) output RLS 11/20/2016

MAT3378 ANOVA Summary

Analysis of Variance. ภาว น ศ ร ประภาน ก ล คณะเศรษฐศาสตร มหาว ทยาล ยธรรมศาสตร

Chap The McGraw-Hill Companies, Inc. All rights reserved.

I i=1 1 I(J 1) j=1 (Y ij Ȳi ) 2. j=1 (Y j Ȳ )2 ] = 2n( is the two-sample t-test statistic.

DETAILED CONTENTS PART I INTRODUCTION AND DESCRIPTIVE STATISTICS. 1. Introduction to Statistics

Ch. 5 Two-way ANOVA: Fixed effect model Equal sample sizes

Chapter 10. Design of Experiments and Analysis of Variance

3. Diagnostics and Remedial Measures

2 Hand-out 2. Dr. M. P. M. M. M c Loughlin Revised 2018

Model II (or random effects) one-way ANOVA:

Statistik för bioteknik sf2911 Föreläsning 15: Variansanalys

Josh Engwer (TTU) 1-Factor ANOVA / 32

Lectures on Simple Linear Regression Stat 431, Summer 2012

STAT Chapter 10: Analysis of Variance

3. Design Experiments and Variance Analysis

Chapter 16. Simple Linear Regression and dcorrelation

Inference for the Regression Coefficient

Ch 2: Simple Linear Regression

ANOVA Randomized Block Design

Multiple t Tests. Introduction to Analysis of Variance. Experiments with More than 2 Conditions

Two-factor studies. STAT 525 Chapter 19 and 20. Professor Olga Vitek

Example: Four levels of herbicide strength in an experiment on dry weight of treated plants.

Unit 27 One-Way Analysis of Variance

What Is ANOVA? Comparing Groups. One-way ANOVA. One way ANOVA (the F ratio test)

Confidence Intervals, Testing and ANOVA Summary

STA2601. Tutorial letter 203/2/2017. Applied Statistics II. Semester 2. Department of Statistics STA2601/203/2/2017. Solutions to Assignment 03

Comparing Several Means: ANOVA

3. Nonparametric methods

21.0 Two-Factor Designs

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

COMPARING SEVERAL MEANS: ANOVA

Diagnostics and Remedial Measures

Analysis of Variance (ANOVA)

STAT 705 Chapter 19: Two-way ANOVA

STAT 705 Chapter 19: Two-way ANOVA

THE ROYAL STATISTICAL SOCIETY HIGHER CERTIFICATE

Comparing Several Means

Factorial Treatment Structure: Part I. Lukas Meier, Seminar für Statistik

Lecture 18 MA Applied Statistics II D 2004

Transcription:

Research Methods II MICHAEL BERNSTEIN CS 376

Goal Understand and use statistical techniques common to HCI research 2

Last time How to plan an evaluation What is a statistical test? Chi-square t-test Paired t-test Mann-Whitney U 3

Today ANOVA Posthoc tests Two-way ANOVA Repeated measures ANOVA 4

ANOVA

t-test: compare two means Do people fix more bugs with our IDE bug suggestion callouts? 6

ANOVA: compare N means Do people fix more bugs with our IDE bug suggestion callouts, with warnings, or with nothing? 7

Cell means model Assume there are r factor levels e.g., laptop + tablet + phone: r=3 Value of the jth observation for the ith factor level: Y ij e.g., Y2,5 is the i=2nd condition and the j=5th user 8

Cell means model ANOVA characterizes each observation as a deviation from the mean of the factor level 9

Cell means model Starter ANOVA model: Y ij = µ i + ij mean for factor level i error: difference between observed value and the mean Yij are independent N(µ i, 2 ) ij µ i Y ij 10

Partitioning the variance The total variability in Y is the difference between each observation Yij and the grand mean Ȳ.. bar is the mean; dot is an aggregate over all observations, here both i and j Easier to understand if we separate it out via the factor level means Y ij Ȳ = Ȳi Ȳ + Y ij Ȳ i total deviation from grand mean deviation of factor mean from grand mean deviation of response from factor mean 11

Ȳ 2 Ȳ 2 Ȳ Ȳ Ȳ Ȳ 1 1 Y ij Ȳ = Ȳi Ȳ + Y ij Ȳ i total deviation from grand mean deviation of factor mean from grand mean deviation of response from factor mean 12

Partitioning the variance Total sum of squares SSTO: SSTO = X i X (Y ij Ȳ ) 2 j Treatment sum of squares SSTR: SSTR = X i n i (Ȳi Ȳ ) 2 Error sum of squares SSE: X SSE = X i (Y ij Ȳ i ) 2 j 13

ANalysis Of VAriance (ANOVA) Provably true: SSTO = SSTR + SSE total variance Degrees of freedom: how many values can vary? (Using n and r) SSTO: n - 1 SSTR: r - 1 differences between factor level means random variation around factor level means SSE: n - r 14

Studentizing the variance Divide each estimator by its degrees of freedom to produce a 2 random variable: Treatment mean square is 2 (r-1) MSTR = SSTR r 1 Error mean square is 2 (n-r) MSE = SSE n r 15

Turning variance into a statistic Null hypothesis: µ 1 = µ 2 =...= µ r Alternate hypothesis: not all µ i are equal Statistics magic: dividing two random variables distributed 2 as produces a random variable distributed as F F = MSTR is MSE F (r 1,n r) Large MSTR relative to MSE suggests that the factor means explain most variance 16

Finally: run the test! How large is the value we constructed from the F distribution? Test if F >F(1 ; r 1,n r) SSTR SSE 3 factor levels SS MS F(2,21) p <.001 24 observations 17

Posthoc tests

We re done...or are we? Significant means One of the µ i are different. That s not very helpful: There is some difference between populating the Facebook news feed with friends vs. strangers vs. only Michael s status updates 19

Estimating pairwise differences Which pairs of factor levels are different from each other? 90.0 67.5 Mean likes 45.0 22.5 0.0 Friend feed Stranger feed Michael feed 20

Roughly: we do pairwise t-tests 90.0 Mean likes 67.5 45.0 t >t(1 ; n r) t >t(1 ; n r) t >t(1 ; n r) 22.5 0.0 Friend feed Stranger feed Michael feed 21

But familywise error! =.05 implies a.95 probability of being correct If we do m tests, the actual probability of being correct is now: m =.95.95.95... <.95 22

Bonferroni correction Avoid familywise error by adjusting to be more conservative Divide by the number of comparisons you make 4 tests at =.05 implies using =.0125 Conservative but accurate method of compensating for multiple tests 23

Bonferroni correction 24

Reporting an ANOVA A one-way ANOVA revealed a significant difference in the effect of news feed source on number of likes (F(2, 21)=12.1, p<.001). Posthoc tests using Bonferroni correction revealed that friend feed and Michael feed were significantly better than a stranger feed (p<.05), but the two were not significantly different from each other (p=.32). 25

Two-way ANOVA

Crossed study designs Suppose you wanted to measure the impact of two factors on total likes on Facebook: Strong ties vs. weak ties in your news feed Presence of a reminder of the last time you liked each friend s content (e.g., You last liked a story from John Hennessy in January ) This is a 2 x 2 study: two factor levels for each factor {tie strength, reminder} 27

Basic two-factor ANOVA model µ ij = µ + i + j mean for ith level of 1st factor & jth level of 2nd factor grand mean difference between ith level of 1st factor and grand mean difference between jth level of 2nd factor and grand mean 28

µ ij = µ + i + j! Example: µ 1,2 µ =8 Mean user has 8 likes: Mean user with strong ties (i=1) has 11 likes: 1 = µ i µ = 11 8=3 Mean user with reminder has 7 likes: 2 = µ j µ =7 8= 1 29

Interaction effects Sometimes the basic model doesn t capture subtle interactions between factors Data: People who see strong ties and have a reminder are especially active Result: Grand mean 8, strong tie mean 11, reminder mean 7, but mean in this cell is 20 30

Two-factor ANOVA test Test for main effects and interaction factor or interaction SS MS F p Main effects are significant, but interaction effect is also significant 31

Repeated measures ANOVA

Within-subjects studies Control for individual variation using the mean response for each participant Before: we found the mean effect of each treatment Now: we find the mean effect of each participant 33

Repeated measures in R repeated measures error term effect of subtracting out the participant means remaining main effects 34

All together now

Always follow every step! 1. Visualize the data 2. Compute descriptive statistics (e.g., mean) 3. Remove outliers >2 standard deviations from the mean 4. Check for heteroskedasticity and non-normal data Try log, square root, or reciprocal transform ANOVA is robust against non-normal data, but not against heteroskedasticity 5. Run statistical test 6. Run any posthoc tests if necessary 36