Lecture (chapter 10): Hypothesis testing III: The analysis of variance

Similar documents
Lecture (chapter 13): Association between variables measured at the interval-ratio level

Lecture 24: Partial correlation, multiple regression, and correlation

One-Way Analysis of Variance. With regression, we related two quantitative, typically continuous variables.

Study Guide #3: OneWay ANALYSIS OF VARIANCE (ANOVA)

Unit 27 One-Way Analysis of Variance

Statistical methods for comparing multiple groups. Lecture 7: ANOVA. ANOVA: Definition. ANOVA: Concepts

Data Analysis and Statistical Methods Statistics 651

Analysis of Variance: Part 1

Disadvantages of using many pooled t procedures. The sampling distribution of the sample means. The variability between the sample means

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance

LECTURE 6. Introduction to Econometrics. Hypothesis testing & Goodness of fit

Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance ECON 509. Dr.

One-factor analysis of variance (ANOVA)

One-Way Analysis of Variance: A Guide to Testing Differences Between Multiple Groups

Introduction to Business Statistics QM 220 Chapter 12

2 Hand-out 2. Dr. M. P. M. M. M c Loughlin Revised 2018

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling

Sampling distribution of t. 2. Sampling distribution of t. 3. Example: Gas mileage investigation. II. Inferential Statistics (8) t =

The t-test: A z-score for a sample mean tells us where in the distribution the particular mean lies

16.400/453J Human Factors Engineering. Design of Experiments II

1. What does the alternate hypothesis ask for a one-way between-subjects analysis of variance?

PSY 216. Assignment 9 Answers. Under what circumstances is a t statistic used instead of a z-score for a hypothesis test

Sampling Distributions: Central Limit Theorem

The One-Way Repeated-Measures ANOVA. (For Within-Subjects Designs)

Factorial designs. Experiments

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

Statistics For Economics & Business

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

Keppel, G. & Wickens, T.D. Design and Analysis Chapter 2: Sources of Variability and Sums of Squares

Black White Total Observed Expected χ 2 = (f observed f expected ) 2 f expected (83 126) 2 ( )2 126

What is a Hypothesis?

Inferences for Regression

POLI 443 Applied Political Research

Institute of Actuaries of India

Chapter 7 Comparison of two independent samples

Lecture 7: Hypothesis Testing and ANOVA

Lecture 5: ANOVA and Correlation

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

10/31/2012. One-Way ANOVA F-test

4.1. Introduction: Comparing Means

Notes for Week 13 Analysis of Variance (ANOVA) continued WEEK 13 page 1

1 Descriptive statistics. 2 Scores and probability distributions. 3 Hypothesis testing and one-sample t-test. 4 More on t-tests

PSYC 331 STATISTICS FOR PSYCHOLOGISTS

CHAPTER 10 ONE-WAY ANALYSIS OF VARIANCE. It would be very unusual for all the research one might conduct to be restricted to

2 and F Distributions. Barrow, Statistics for Economics, Accounting and Business Studies, 4 th edition Pearson Education Limited 2006

Table 1: Fish Biomass data set on 26 streams

In a one-way ANOVA, the total sums of squares among observations is partitioned into two components: Sums of squares represent:

The Chi-Square Distributions

LECTURE 5. Introduction to Econometrics. Hypothesis testing

Hypothesis testing: Steps

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

One-way ANOVA. Experimental Design. One-way ANOVA

The Chi-Square Distributions

Last week: Sample, population and sampling distributions finished with estimation & confidence intervals

Draft Proof - Do not copy, post, or distribute. Chapter Learning Objectives REGRESSION AND CORRELATION THE SCATTER DIAGRAM

Quantitative Analysis and Empirical Methods

MGEC11H3Y L01 Introduction to Regression Analysis Term Test Friday July 5, PM Instructor: Victor Yu

Final Exam - Solutions

Hypothesis testing: Steps

Using SPSS for One Way Analysis of Variance

2) For a normal distribution, the skewness and kurtosis measures are as follows: A) 1.96 and 4 B) 1 and 2 C) 0 and 3 D) 0 and 0

Comparing Means from Two-Sample

Lecture 14: ANOVA and the F-test

Two-Sample Inferential Statistics

An inferential procedure to use sample data to understand a population Procedures

Last two weeks: Sample, population and sampling distributions finished with estimation & confidence intervals

Statistics for Managers Using Microsoft Excel

Ch 2: Simple Linear Regression

Lectures 5 & 6: Hypothesis Testing

Introduction to the Analysis of Variance (ANOVA)

(ii) Scan your answer sheets INTO ONE FILE only, and submit it in the drop-box.

Chapter 12 - Lecture 2 Inferences about regression coefficient

HYPOTHESIS TESTING. Hypothesis Testing

The One-Way Independent-Samples ANOVA. (For Between-Subjects Designs)

Chapter 8: Hypothesis Testing Lecture 9: Likelihood ratio tests

While you wait: Enter the following in your calculator. Find the mean and sample variation of each group. Bluman, Chapter 12 1

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Analysis of Variance (ANOVA)

ECO220Y Simple Regression: Testing the Slope

Chapter Seven: Multi-Sample Methods 1/52

Difference in two or more average scores in different groups

Psych 230. Psychological Measurement and Statistics

Chapter 10: Chi-Square and F Distributions

BIO5312 Biostatistics Lecture 6: Statistical hypothesis testings

Lecture 41 Sections Mon, Apr 7, 2008

Hypothesis Testing. Hypothesis: conjecture, proposition or statement based on published literature, data, or a theory that may or may not be true

PHP2510: Principles of Biostatistics & Data Analysis. Lecture X: Hypothesis testing. PHP 2510 Lec 10: Hypothesis testing 1

Introduction to Analysis of Variance. Chapter 11

Hypothesis Testing. We normally talk about two types of hypothesis: the null hypothesis and the research or alternative hypothesis.

Pooled Variance t Test

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

ANOVA Analysis of Variance

Rama Nada. -Ensherah Mokheemer. 1 P a g e

Business Statistics. Lecture 10: Correlation and Linear Regression

Sleep data, two drugs Ch13.xls

Power Analysis. Introduction to Power

POLI 443 Applied Political Research

One-Way ANOVA Source Table J - 1 SS B / J - 1 MS B /MS W. Pairwise Post-Hoc Comparisons of Means

Analysis of Variance (ANOVA)

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC

Transcription:

Lecture (chapter 10): Hypothesis testing III: The analysis of variance Ernesto F. L. Amaral March 19 21, 2018 Advanced Methods of Social Research (SOCI 420) Source: Healey, Joseph F. 2015. Statistics: A Tool for Social Research. Stamford: Cengage Learning. 10th edition. Chapter 10 (pp. 247 275).

Chapter learning objectives Identify and cite examples of situations in which analysis of variance (ANOVA) is appropriate Explain the logic of hypothesis testing as applied to ANOVA Perform the ANOVA test, using the five-step model as a guide, and correctly interpret the results Define and explain the concepts of population variance, total sum of squares, sum of squares between, sum of squares within, mean square estimates Explain the difference between the statistical significance and the importance (magnitude) of relationships between variables 2

ANOVA application ANOVA can be used in situations where the researcher is interested in the differences in sample means across three or more categories How do Protestants, Catholics, and Jews vary in terms of number of children? How do Republicans, Democrats, and Independents vary in terms of income? How do older, middle-aged, and younger people vary in terms of frequency of church attendance? 3

Extension of t-test We can think of ANOVA as an extension of t-test for more than two groups Are the differences between the samples large enough to reject the null hypothesis and justify the conclusion that the populations represented by the samples are different? Null hypothesis, H 0 H 0 : μ 1 = μ 2 = μ 3 = = μ k All population means are similar to each other Alternative hypothesis, H 1 At least one of the populations means is different 4

Logic of ANOVA Could there be a relationship between age and support for capital punishment? No difference between groups Difference between groups Source: Healey 2015, p.249. 5

Between and within differences If the H 0 is true, the sample means should be about the same value If the H 0 is true, there will be little difference between sample means If the H 0 is false There should be substantial differences between sample means (between categories) There should be relatively little difference within categories The sample standard deviations should be small within groups 6

Likelihood of rejecting H 0 The greater the difference between categories (as measured by the means) Relative to the differences within categories (as measured by the standard deviations) The more likely the H 0 can be rejected When we reject H 0 We are saying there are differences between the populations represented by the sample 7

Computation of ANOVA 1. Find total sum of squares (SST)!!" = $ % ' & ) *% ' 2. Find sum of squares between (SSB)!!+ = $ ), *%, *% ' SSB = sum of squares between categories ), = number of cases in a category *%, = mean of a category 3. Find sum of squares within (SSW) SSW = SST SSB 8

4. Degrees of freedom dfw = N k dfw = degrees of freedom within N = total number of cases k = number of categories dfb = k 1 dfb = degrees of freedom between k = number of categories 9

Final estimations 5. Find mean square estimates!"#$ %&'#(" )*+h*$ =../ 01)!"#$ %&'#(" 2"+)""$ =..3 012 6. Find the F ratio 4 52+#*$"0 =!"#$ %&'#(" 2"+)""$!"#$ %&'#(" )*+h*$ 10

Example Support for capital punishment Sample of 16 people who are equally divided across four age groups Source: Healey 2015, p.252. 11

Step 1: Assumptions,requirements Independent random samples Interval-ratio level of measurement Normally distributed populations Equal population variances 12

Step 2: Null hypothesis Null hypothesis, H 0 : μ 1 = μ 2 = μ 3 = μ 4 The null hypothesis asserts there is no difference between the populations Alternative hypothesis, H 1 At least one of the populations means is different 13

Step 3: Distribution, critical region Sampling distribution F distribution Significance level Alpha (α) = 0.05 Degrees of freedom dfw = N k = 16 4 = 12 dfb = k 1 = 4 1 = 3 Critical F F(critical) = 3.49 14

Step 4: Test statistic 1. Total sum of squares (SST)!!" = $ % & ' ) *% '!!" = 438 + 702 + 1058 + 1994 16 15.25 '!!" = 471.04 2. Sum of squares between (SSB)!!7 = $ ) 8 *% 8 *% ' SSB = 4(10 15.25) ' + 4(13 15.25) ' + 4(16 15.25) ' + 4(22 15.25) ' = 314.96 3. Sum of squares within (SSW) SSW = SST SSB = 471.04 314.96 = 156.08 15

4. Degrees of freedom dfw = N k = 16 4 = 12 dfb = k 1 = 4 1 = 3 5. Mean square estimates!"#$ %&'#(" )*+h*$ =../ 01) = 156.08 12!"#$ %&'#(" :"+)""$ =..; 01: = 314.96 3 = 13.00 = 104.99 6. F ratio >?:+#*$"0 =!"#$ %&'#(" :"+)""$!"#$ %&'#(" )*+h*$ = 104.99 13.00 = 8.08 16

Step 5: Decision, interpret F(obtained) = 8.08 This is beyond F(critical) = 3.49 The obtained test statistic falls in the critical region, so we reject the H 0 Support for capital punishment does differ across age groups 17

Example from 2016 GSS We know the average income by race/ethnicity. table raceeth [aweight=wtssall], c(mean conrinc sd conrinc n conrinc) Race/Ethn icity mean(conrinc) sd(conrinc) N(conrinc) White 38845.61946 39157.17 1,072 Black 23243.0413 19671.53 273 Hispanic 23128.91777 21406.31 215 Other 50156.34855 59219.9 72 Does at least one category of the race/ethnicity variable have average income different than the others? This is not a perfect example for ANOVA, because the race/ethnicity variable does not have equal numbers of cases across its categories 18

Example from GSS: Result The probability of not rejecting H 0 is small (p<0.01) At least one category of the race/ethnicity variable has average income different than the others with a 99% confidence level However, ANOVA does not inform which category has an average income significantly different than the others in 2016. oneway conrinc raceeth [aweight=wtssall] Analysis of Variance Source SS df MS F Prob > F Between groups 1.0142e+11 3 3.3806e+10 26.23 0.0000 Within groups 2.0980e+12 1628 1.2887e+09 Total 2.1994e+12 1631 1.3485e+09 Source: 2016 General Social Survey. 19

Edited table Table 1. One-way analysis of variance for individual average income of the U.S. adult population by race/ethnicity, 2004, 2010, and 2016 Source 2004 Sum of Squares Source: 2004, 2010, 2016 General Social Surveys. Degrees of Freedom Mean of Squares F-test Prob > F Between groups 5.92e+10 3 1.97e+10 16.36 0.0000 Within groups 2.03e+12 1,682 1.21e+09 Total 2.09e+12 1,685 1.24e+09 2010 Between groups 6.02e+10 3 2.01e+10 24.50 0.0000 Within groups 9.79e+11 1,195 819,590,864 Total 1.04e+12 1,198 867,818,893 2016 Between groups 1.01e+11 3 3.38e+10 26.23 0.0000 Within groups 2.10e+12 1,628 1.29e+09 Total 2.20e+12 1,631 1.35e+09 20

Limitations of ANOVA Requires interval-ratio level measurement of the dependent variable Requires roughly equal numbers of cases in the categories of the independent variable Statistically significant differences are not necessarily important (small magnitude) The alternative (research) hypothesis is not specific It only asserts that at least one of the population means differs from the others 21