Section 11.1 Chi Square Statistic

k Categories

| | 1st | 2nd | 3rd | ... | kth | Total |
|---|---|---|---|---|---|---|
| Observed Frequencies | $O_1$ | $O_2$ | $O_3$ | ... | $O_k$ | $n$ |
| Expected Frequencies | $E_1$ | $E_2$ | $E_3$ | ... | $E_k$ | $n$ |

$$O_1 + O_2 + O_3 + \cdots + O_k = n \qquad E_1 + E_2 + E_3 + \cdots + E_k = n$$

Another word for category is cell. The above table consists of k categories. The first row contains the observed (actual) frequencies of the k categories. The second row contains the expected (theoretical) frequencies of the k categories. We want to determine whether the observed and expected frequencies agree with each other. This is accomplished with hypothesis tests that employ the χ² statistic:

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E}$$

Example

Suppose that you want to test the fairness of a six-sided die. You roll it 60 times. How many times do you expect the faces 1, 2, 3, 4, 5, and 6 to appear? Since the theoretical probability of rolling any one number on a die is 1/6, we would expect any one number to appear (1/6)(60) = 10 times out of 60 rolls. That is, $E_1 = 10$, $E_2 = 10$, $E_3 = 10$, $E_4 = 10$, $E_5 = 10$, and $E_6 = 10$.

Suppose that you want to test the fairness of a coin. You toss it 100 times. How many times do you expect heads and tails to appear? Since the theoretical probability of tossing a head or a tail is 1/2, we would expect a head to appear (1/2)(100) = 50 times and a tail to appear (1/2)(100) = 50 times. Hence, $E_1 = 50$ and $E_2 = 50$.
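The expected frequencies and the χ² statistic described above can be sketched in a few lines of Python. This is only an illustration: the function names `expected_counts` and `chi_square` are hypothetical, the 100 coin tosses are inferred from the expected count of 50, and the observed die counts are invented purely to exercise the formula.

```python
def expected_counts(n, probabilities):
    """Expected frequency of each cell: E_i = n * p_i."""
    return [n * p for p in probabilities]


def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Fair die rolled 60 times, fair coin tossed 100 times.
die_expected = expected_counts(60, [1 / 6] * 6)
coin_expected = expected_counts(100, [0.5, 0.5])
print(die_expected)
print(coin_expected)

# Hypothetical observed counts (not from the notes), summing to 60.
hypothetical_rolls = [8, 12, 9, 11, 10, 10]
die_chi2 = chi_square(hypothetical_rolls, die_expected)
print(die_chi2)
```

A χ² value near zero means the observed and expected frequencies agree closely; larger values indicate disagreement.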

Section 11.2 Inferences Concerning Multinomial Experiments

Multinomial Experiment

An experiment with the following characteristics:

1. It consists of n repeated (identical) independent trials.
2. The outcome of each trial fits into exactly one of k possible cells.
3. There is a probability associated with each particular cell, and these individual probabilities remain constant during the experiment. (It must be true that $p_1 + p_2 + \cdots + p_k = 1$.)
4. The experiment will result in a set of k observed frequencies, $O_1, O_2, \ldots, O_k$, where each $O_i$ is the number of times a trial outcome falls into that particular cell. (It must be the case that $O_1 + O_2 + \cdots + O_k = n$.)

Elementary Statistics, Eighth Edition, Robert Johnson and Patricia Kuby, p. 544.

The hypothesis tests that we conduct use the statistic

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E},$$

with degrees of freedom df = k − 1, where k is the number of categories. It is also very important to remember that a hypothesis test for a multinomial experiment is always a right tail test.

The hypothesis statements for multinomial tests follow.

$H_o$: $p_1 = p_{0,1},\ p_2 = p_{0,2},\ \ldots,\ p_k = p_{0,k}$ vs. $H_a$: At least one inequality exists.

Examples

A certain type of flower seed will produce magenta, chartreuse, and ochre flowers in the ratio 6 : 3 : 1 (one flower per seed). A total of 100 seeds are planted and all germinate, yielding the following results.

| Magenta | Chartreuse | Ochre |
|---|---|---|
| 52 | 36 | 12 |

a) If the null hypothesis is true, what is the expected number of magenta, chartreuse, and ochre flowers?

100 seeds are planted. The expected ratio of colors is 6 : 3 : 1, and 6 + 3 + 1 = 10.

The expected number of magenta is (6/10)(100) = 60.

The expected number of chartreuse is (3/10)(100) = 30.

The expected number of ochre is (1/10)(100) = 10.

A table of expected vs. observed follows.

| | Magenta | Chartreuse | Ochre |
|---|---|---|---|
| Observed | 52 | 36 | 12 |
| Expected | 60 | 30 | 10 |

b) How many degrees of freedom are associated with χ²?

Degrees of freedom, df = k − 1. Since there are three categories, df = 3 − 1 = 2.

c) Complete the hypothesis test using α = .10.

State the hypotheses, and identify the claim.

$H_o$: $p_1 = .6,\ p_2 = .3,\ p_3 = .1$ (claim) vs. $H_a$: At least one inequality exists.

Let's first obtain χ².

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \frac{(52 - 60)^2}{60} + \frac{(36 - 30)^2}{30} + \frac{(12 - 10)^2}{10} = 2.667$$

Now let's get the p-value. Remember that this is a right tail test, so that will affect the way we obtain our p-value.

p-value = χ²cdf(2.667, E99, 2) = .264

[Diagram: χ² distribution with df = 2; the area to the right of χ² = 2.667 is the p-value, .264.]

Decision: Fail to reject $H_o$, since the p-value .264 is greater than α = .10.

Conclusion: There is not enough evidence to reject the claim that the hypothesized frequencies of flower colors are correct.
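The flower-seed test can be reproduced with a short Python sketch. It does not use the calculator's χ²cdf function; instead it relies on the fact that, for df = 2 only, the right-tail area of the χ² distribution is exactly exp(−χ²/2). The variable names are illustrative choices, not part of the original notes.

```python
import math

observed = [52, 36, 12]   # magenta, chartreuse, ochre
ratio = [6, 3, 1]         # hypothesized ratio 6 : 3 : 1

n = sum(observed)
expected = [n * r / sum(ratio) for r in ratio]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Right-tail area. The closed form exp(-x / 2) is valid for df = 2 only.
if df != 2:
    raise ValueError("this shortcut for the p-value assumes df = 2")
p_value = math.exp(-chi2 / 2)

alpha = 0.10
decision = "Reject Ho" if p_value < alpha else "Fail to reject Ho"
print(expected, round(chi2, 3), round(p_value, 3), decision)
```

For other degrees of freedom a statistics library or the calculator's χ²cdf would be needed in place of the closed form.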

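We now carry out the same test by the traditional (critical value) method. Find the critical value for this test using the INVCHI program: χ²(2, .10) = 4.60517. Now draw the picture.

[Diagram: χ² distribution with df = 2; the rejection region of area α = .10 lies to the right of χ²(2, .10) = 4.60517, and the test statistic χ² = 2.667 lies to its left.]

Clearly, χ² is outside of the rejection region. Hence, our decision to fail to reject $H_o$ is supported.

To obtain χ² using your calculator, enter the O cells in L1 and the E cells in L2. Go to L3, hit the UP arrow to highlight L3, then hit ENTER. Now type in the following: (L1 − L2)²/L2 ENTER. Hit STAT, go to CALC, and hit ENTER once to select 1-Var Stats. Then enter L3 and hit ENTER. Your calculator output should look like this.

$$\begin{aligned}
\bar{x} &= .8888888889 \\
\Sigma x &= 2.666666667 \\
\Sigma x^2 &= 2.73777778 \\
S_x &= .4286067005 \\
\sigma_x &= .3499559055 \\
n &= 3
\end{aligned}$$

$\chi^2 = \Sigma x = 2.666666667$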
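As a cross-check on the INVCHI program and the calculator lists, the following Python sketch mirrors the same steps. It assumes df = 2, where the right-tail area is exp(−x/2), so the critical value is −2 ln(α). The list names L1, L2, and L3 simply imitate the calculator lists.

```python
import math

alpha = 0.10
# For df = 2 only: P(X >= x) = exp(-x / 2), so the critical value is -2 ln(alpha).
critical_value = -2 * math.log(alpha)

L1 = [52, 36, 12]                                  # observed cells
L2 = [60, 30, 10]                                  # expected cells
L3 = [(o - e) ** 2 / e for o, e in zip(L1, L2)]    # (L1 - L2)^2 / L2
chi2 = sum(L3)                                     # the sum-of-x line of 1-Var Stats

in_rejection_region = chi2 > critical_value
print(round(critical_value, 5), round(chi2, 9), in_rejection_region)
```

Because the statistic is smaller than the critical value, it falls outside the rejection region, which agrees with the p-value decision.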

Section 11.3 Inferences Concerning Contingency Tables

A contingency table is an arrangement of data into a two way classification. The data are sorted into cells, and the number of data in each cell is reported. (Elementary Statistics, Eighth Edition, Robert Johnson and Patricia Kuby, p. 553.)

Contingency tables are useful for tests of independence and tests of homogeneity. We will be using the χ² distribution to conduct these tests.

Test of Independence

Example

A survey of randomly selected travelers who visited the service station restrooms of a large U.S. petroleum distributor showed the following results.

Quality of Restroom Facilities (Observed)

| Gender of Respondent | Above Average | Average | Below Average | Totals |
|---|---|---|---|---|
| Female | 7 | 24 | 28 | 59 |
| Male | 8 | 26 | 7 | 41 |
| Totals | 15 | 50 | 35 | 100 |

Using α = .05, does the sample present sufficient evidence to reject the hypothesis, "Quality of responses is independent of the gender of the respondent"?

State the hypotheses.

$H_o$: The quality of responses is independent of the gender of the respondent. vs. $H_a$: The quality of responses is dependent on the gender of the respondent.

Obtain the test statistic, χ². To do this, we must first obtain a table of expected frequencies. Recall that if two events are independent, then the probability of their intersection is equal to the product of their individual probabilities. Assuming that gender and response are independent, we obtain the EXPECTED FREQUENCY contingency table using the column and row totals of the OBSERVED FREQUENCY contingency table as follows.

$$\text{Expected Frequency} = n \cdot \frac{\text{Row Total}}{n} \cdot \frac{\text{Column Total}}{n} = \frac{\text{Row Total} \times \text{Column Total}}{n}$$

Quality of Restroom Facilities (Expected)

| Gender of Respondent | Above Average | Average | Below Average |
|---|---|---|---|
| Female | $\frac{59 \cdot 15}{100} = 8.85$ | $\frac{59 \cdot 50}{100} = 29.5$ | $\frac{59 \cdot 35}{100} = 20.65$ |
| Male | $\frac{41 \cdot 15}{100} = 6.15$ | $\frac{41 \cdot 50}{100} = 20.5$ | $\frac{41 \cdot 35}{100} = 14.35$ |

Recall that

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \frac{(7 - 8.85)^2}{8.85} + \frac{(24 - 29.5)^2}{29.5} + \frac{(28 - 20.65)^2}{20.65} + \frac{(8 - 6.15)^2}{6.15} + \frac{(26 - 20.5)^2}{20.5} + \frac{(7 - 14.35)^2}{14.35} = 9.825$$

For a two way contingency table, the degrees of freedom are df = (number of rows − 1)(number of columns − 1) = (r − 1)(c − 1) = 1(2) = 2.

For a two way contingency table we will always conduct a right tail test. Let's find the p-value.

p-value = χ²cdf(9.825, E99, 2) = .00735

[Diagram: χ² distribution with df = 2; the area to the right of χ² = 9.825 is the p-value, .00735.]

Decision: Reject $H_o$.
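The expected-frequency table and the test of independence above can be sketched in Python as follows. The sketch builds each expected cell as (row total × column total)/n and, because df = 2 here, uses the exact right-tail formula exp(−χ²/2) rather than the calculator's χ²cdf. Variable names are illustrative.

```python
import math

observed = [
    [7, 24, 28],   # Female: above average, average, below average
    [8, 26, 7],    # Male
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell = (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# The closed form exp(-x / 2) for the right-tail area is valid for df = 2 only.
if df != 2:
    raise ValueError("this shortcut for the p-value assumes df = 2")
p_value = math.exp(-chi2 / 2)

print(expected)
print(round(chi2, 3), df, round(p_value, 5))
```

Since the p-value is below α = .05, the sketch reaches the same decision as the notes: reject the hypothesis of independence.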

We perform the same hypothesis test, but in the traditional manner. Find the critical value, χ²(df, .05) = χ²(2, .05) = 5.99146.

[Diagram: χ² distribution with df = 2; the rejection region of area α = .05 lies to the right of χ²(2, .05) = 5.99146, and the test statistic χ² = 9.825 lies inside it.]

Clearly, χ² is in the rejection region. Hence, we reject $H_o$.

Test of Homogeneity

A test of homogeneity is used when the experimenter controls one of the two variables so that the row (or column) totals are predetermined. (Elementary Statistics, Eighth Edition, Robert Johnson and Patricia Kuby, p. 557.)

The hypotheses for a test of homogeneity follow.

$H_o$: The distribution of proportions in row (or column) 1 is the same as in row (or column) 2, ..., is the same as in row (or column) k. vs. $H_a$: The distribution of proportions in the rows (or columns) is not the same.

These are the only differences between a test of independence and a test of homogeneity; the expected frequencies, test statistic, and degrees of freedom are computed in the same way. KNOW HOW TO DISTINGUISH BETWEEN THE TWO TESTS!

Example

A study of the Harvard Business School conducted in 1998 (Fortune, "Tales of the Trailblazers," 10/12/98) concentrated on the career paths and life styles of the women who were graduates of the program. The focus was on the class of 1973 and the class of 1983. The class of 1973 was the first to include a solid number of women. But did things change ten years later? Consider the following table.

| Harvard Business School Class | Men | Women | Total |
|---|---|---|---|
| 1973 | 742 | 34 | 776 |
| 1983 | 538 | 189 | 727 |
| Total | 1280 | 223 | 1503 |

At the .01 level of significance, did the distribution of men and women who completed the program significantly change between 1973 and 1983?

Clearly this is a test of homogeneity. Which totals were predetermined? The total number of people graduating from each class.

State the hypotheses.

$H_o$: The proportions of men and women graduating from the Harvard School of Business are the same for the years 1973 and 1983. vs. $H_a$: The proportions of men and women graduating from the Harvard School of Business are not the same for the years 1973 and 1983.

Obtain the expected frequency table.

| Harvard Business School Class | Men | Women |
|---|---|---|
| 1973 | $\frac{776 \cdot 1280}{1503} = 660.865$ | $\frac{776 \cdot 223}{1503} = 115.135$ |
| 1983 | $\frac{727 \cdot 1280}{1503} = 619.135$ | $\frac{727 \cdot 223}{1503} = 107.865$ |

Recall that

$$\chi^2 = \sum_{\text{all cells}} \frac{(O - E)^2}{E} = \frac{(742 - 660.865)^2}{660.865} + \frac{(34 - 115.135)^2}{115.135} + \frac{(538 - 619.135)^2}{619.135} + \frac{(189 - 107.865)^2}{107.865} = 138.798$$

Degrees of freedom: df = (number of rows − 1)(number of columns − 1) = (r − 1)(c − 1) = (2 − 1)(2 − 1) = 1.

Remembering that we perform a right tail test, p-value = χ²cdf(138.798, E99, 1) = $4.88 \times 10^{-32}$.

Decision: Reject $H_o$. The proportions are not the same in the years 1973 and 1983. It appears that women are making some headway!

[Diagram: χ² distribution with df = 1; the area to the right of χ² = 138.798 is the p-value, $4.88 \times 10^{-32}$.]
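The Harvard Business School computation can be checked with the Python sketch below. The expected frequencies are built from the row and column totals exactly as in the test of independence. For the p-value the sketch assumes df = 1, where the right-tail area of the χ² distribution equals erfc(√(χ²/2)); this stands in for the calculator's χ²cdf and is not how the notes obtain the value.

```python
import math

observed = [
    [742, 34],    # class of 1973: men, women
    [538, 189],   # class of 1983: men, women
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell = (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 only: P(X >= x) = erfc(sqrt(x / 2)).
if df != 1:
    raise ValueError("this shortcut for the p-value assumes df = 1")
p_value = math.erfc(math.sqrt(chi2 / 2))

print([[round(e, 3) for e in row] for row in expected])
print(round(chi2, 3), df, p_value)
```

The printed p-value should be on the order of 10⁻³², far below α = .01.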

Let's conduct this test the traditional way. Find the critical value, χ²(1, .01): χ²(1, .01) = 6.63490. Draw the picture of the rejection region, and determine where χ² resides.

[Diagram: χ² distribution with df = 1; the rejection region of area α = .01 lies to the right of χ²(1, .01) = 6.63490, and the test statistic χ² = 138.798 lies far inside it.]

The test statistic is WAY INSIDE the rejection region, so this supports our p-value decision to reject $H_o$.
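The critical value χ²(1, .01) can also be approximated numerically. The sketch below is not the INVCHI program; it is a simple bisection search that assumes df = 1, where the right-tail area is erfc(√(x/2)). The function names and the search bounds are illustrative choices.

```python
import math


def right_tail_df1(x):
    """Right-tail area of the chi-square distribution with df = 1."""
    return math.erfc(math.sqrt(x / 2))


def critical_value_df1(alpha, lo=0.0, hi=100.0, tol=1e-10):
    """Find x with right_tail_df1(x) = alpha by bisection.

    The right-tail area decreases as x grows, so the search keeps
    the half-interval that still brackets alpha.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if right_tail_df1(mid) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


critical_value = critical_value_df1(0.01)
chi2 = 138.798   # test statistic from the example
in_rejection_region = chi2 > critical_value
print(round(critical_value, 5), in_rejection_region)
```

The statistic exceeds the critical value by a wide margin, which matches the decision to reject the null hypothesis.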