Nominal Data. Parametric Statistics. Nonparametric Statistics. Parametric vs Nonparametric Tests. Greg C Elvers

Similar documents
Parametric versus Nonparametric Statistics-when to use them and which is more powerful? Dr Mahmoud Alhussami

CHAPTER 17 CHI-SQUARE AND OTHER NONPARAMETRIC TESTS FROM: PAGANO, R. R. (2007)

The goodness-of-fit test Having discussed how to make comparisons between two proportions, we now consider comparisons of multiple proportions.

HYPOTHESIS TESTING: THE CHI-SQUARE STATISTIC

Psych 230. Psychological Measurement and Statistics

4/6/16. Non-parametric Test. Overview. Stephen Opiyo. Distinguish Parametric and Nonparametric Test Procedures

Introduction to Statistical Data Analysis Lecture 7: The Chi-Square Distribution

16.400/453J Human Factors Engineering. Design of Experiments II

Classroom Activity 7 Math 113 Name : 10 pts Intro to Applied Stats

We know from STAT.1030 that the relevant test statistic for equality of proportions is:

Statistics for Managers Using Microsoft Excel

Chapter 10. Chapter 10. Multinomial Experiments and. Multinomial Experiments and Contingency Tables. Contingency Tables.

BIO 682 Nonparametric Statistics Spring 2010

Chapter 9 Inferences from Two Samples

Inferential statistics

Basic Business Statistics, 10/e

CHI SQUARE ANALYSIS 8/18/2011 HYPOTHESIS TESTS SO FAR PARAMETRIC VS. NON-PARAMETRIC

Statistical Analysis for QBIC Genetics Adapted by Ellen G. Dow 2017

Lecture Slides. Elementary Statistics. by Mario F. Triola. and the Triola Statistics Series

Review for Final. Chapter 1 Type of studies: anecdotal, observational, experimental Random sampling

Lecture Slides. Section 13-1 Overview. Elementary Statistics Tenth Edition. Chapter 13 Nonparametric Statistics. by Mario F.

Testing Independence

Lab #12: Exam 3 Review Key

Relate Attributes and Counts

Nonparametric statistic methods. Waraphon Phimpraphai DVM, PhD Department of Veterinary Public Health

Using SPSS for One Way Analysis of Variance

PSY 307 Statistics for the Behavioral Sciences. Chapter 20 Tests for Ranked Data, Choosing Statistical Tests

Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance ECON 509. Dr.

The t-statistic. Student s t Test

Lecture 28 Chi-Square Analysis

Topic 21 Goodness of Fit

Introduction and Descriptive Statistics p. 1 Introduction to Statistics p. 3 Statistics, Science, and Observations p. 5 Populations and Samples p.

" M A #M B. Standard deviation of the population (Greek lowercase letter sigma) σ 2

CIVL /8904 T R A F F I C F L O W T H E O R Y L E C T U R E - 8

Psych Jan. 5, 2005

Two-Sample Inferential Statistics

11-2 Multinomial Experiment

Statistics for Business and Economics

Chapter 8 Student Lecture Notes 8-1. Department of Economics. Business Statistics. Chapter 12 Chi-square test of independence & Analysis of Variance

An Analysis of College Algebra Exam Scores December 14, James D Jones Math Section 01

Frequency Distribution Cross-Tabulation

PSY 216. Assignment 9 Answers. Under what circumstances is a t statistic used instead of a z-score for a hypothesis test

Multiple t Tests. Introduction to Analysis of Variance. Experiments with More than 2 Conditions

Chapter 11. Hypothesis Testing (II)

Retrieve and Open the Data

psychological statistics

Application of Variance Homogeneity Tests Under Violation of Normality Assumption

Lecture 9. Selected material from: Ch. 12 The analysis of categorical data and goodness of fit tests

Transition Passage to Descriptive Statistics 28

Chapter 15: Nonparametric Statistics Section 15.1: An Overview of Nonparametric Statistics

Non-parametric methods

Degrees of freedom df=1. Limitations OR in SPSS LIM: Knowing σ and µ is unlikely in large

Chapter 26: Comparing Counts (Chi Square)

Analysis of Variance: Part 1

Class Notes: Week 8. Probit versus Logit Link Functions and Count Data

10/4/2013. Hypothesis Testing & z-test. Hypothesis Testing. Hypothesis Testing

Inferences About the Difference Between Two Means

Ron Heck, Fall Week 3: Notes Building a Two-Level Model

41.2. Tests Concerning a Single Sample. Introduction. Prerequisites. Learning Outcomes

Mathematical Notation Math Introduction to Applied Statistics

13.1 Categorical Data and the Multinomial Experiment

STP 226 ELEMENTARY STATISTICS NOTES

PSYC 331 STATISTICS FOR PSYCHOLOGISTS

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests

Binary Logistic Regression

Section 4.6 Simple Linear Regression

Hypothesis Testing hypothesis testing approach

Psychology 282 Lecture #4 Outline Inferences in SLR

One-Way ANOVA. Some examples of when ANOVA would be appropriate include:

Introduction to Business Statistics QM 220 Chapter 12

Table of z values and probabilities for the standard normal distribution. z is the first column plus the top row. Each cell shows P(X z).

Lecture 41 Sections Mon, Apr 7, 2008

The Chi-Square Distributions

Rama Nada. -Ensherah Mokheemer. 1 P a g e

Nonparametric Statistics

8.1-4 Test of Hypotheses Based on a Single Sample

AMS7: WEEK 7. CLASS 1. More on Hypothesis Testing Monday May 11th, 2015

Chapter 12. Analysis of variance

Quantitative Analysis and Empirical Methods

Class 24. Daniel B. Rowe, Ph.D. Department of Mathematics, Statistics, and Computer Science. Marquette University MATH 1700

Example. χ 2 = Continued on the next page. All cells

Introducing Generalized Linear Models: Logistic Regression

Ch. 7. One sample hypothesis tests for µ and σ

Chapter 8. The analysis of count data. This is page 236 Printer: Opaque this

Statistics in medicine

Independent Samples ANOVA

Selection of Variable Selecting the right variable for a control chart means understanding the difference between discrete and continuous data.

12.10 (STUDENT CD-ROM TOPIC) CHI-SQUARE GOODNESS- OF-FIT TESTS

Ron Heck, Fall Week 8: Introducing Generalized Linear Models: Logistic Regression 1 (Replaces prior revision dated October 20, 2011)

A SHORT INTRODUCTION TO PROBABILITY

Contents. Acknowledgments. xix

NON-PARAMETRIC STATISTICS * (

Advanced Experimental Design

Final Exam. Name: Solution:

Categorical Data Analysis. The data are often just counts of how many things each category has.

Statistics: revision

STP 226 EXAMPLE EXAM #3 INSTRUCTOR:

Statistics 135 Fall 2008 Final Exam

MATH Notebook 3 Spring 2018

Transcription:

Nominal Data Greg C Elvers 1 Parametric Statistics The inferential statistics that we have discussed, such as t and ANOVA, are parametric statistics A parametric statistic is a statistic that makes certain assumptions about how the data are distributed Typically, they assume that the data are distributed normally Nonparametric Statistics Nonparametric statistics do not make assumptions about the underlying distribution of the data Thus, nonparametric statistics are useful when the data are not normally distributed Because nominally scaled variables cannot be normally distributed, nonparametric statistics should be used with them 3 Parametric vs Nonparametric Tests When you have a choice, you should use parametric statistics because they have greater statistical power than the corresponding nonparametric tests That is, parametric statistics are more likely to correctly reject H 0 than nonparametric statistics 4

The binomial test is a type of nonparametric statistic The binomial test is used when the DV is nominal, and it has only two categories or classes It is used to answer the question: In a sample, is the proportion of observations in one category different than a given proportion? 5 A researcher wants to know if the proportion of ailurophiles in a group of 0 librarians is greater than that found in the general population,.40 There are 9 ailurophiles in the group of 0 librarians 6 Write H 0 and H 1 : H 0 : P.40 H 1 : P >.40 Is the hypothesis one-tailed or two-tailed? Directional, one-tailed Determine the statistical test The librarians can either be or not be ailurophiles, thus we have a dichotomous, nominally scaled variable Use the binomial test 7 Determine the critical value from a table of critical binomial values Find the column that corresponds to the p value (in this case.40) Find the row that corresponds to the sample size (N 0) and α (.05) The critical value is 13 8

If the observed number of ailurophiles (9) is greater than or equal to the critical value (13), you can reject H 0 We fail to reject H 0 ; there is insufficient evidence to conclude that the percentage of librarians who are ailurophiles is probably greater than that of the general population 9 When the sample size is greater than or equal to 50, then a normal approximation (i.e. a z-test) can be used in place of the binomial test When the product of the sample size (N), p, and 1 - p is greater than or equal to 9, then the normal approximation can be use 10 The normal approximation to the binomial test is defined as: z x NP NP 1 ( P) x number of observations in the category N sample size P probability in question 11 A researcher wants to know if the proportion of ailurophiles in a group of 100 librarians is greater than that found in the general population,.40 There are 43 ailurophiles in the group of 100 librarians 1

Write H 0 and H 1 : H 0 : P.40 H 1 : P >.40 Is the hypothesis one-tailed or two-tailed? Directional, one-tailed Determine the statistical test The librarians can either be or not be ailurophiles, thus we have a dichotomous, nominally scaled variable Use the z test, because n 50 13 Calculate the z-score z x NP NP 1 3 4.899 0.61 ( P) 43 100.40 100.40 ( 1.40) 14 Determine the critical value from a table of area under the normal curve Find the z-score that corresponds to an area of.05 above the z-score That value is 1.65 Compare the calculated z-score to the critical z-score If z calculated z critical, then reject H 0 0.61 < 1.65; fail to reject H 0 15 χ -- One Variable When you have nominal data that has more than two categories, the binomial test is not appropriate The χ (chi squared) test is appropriate in such instances The χ test answers the following question: Is the observed number of items in each category different from a theoretically expected number of observations in the categories? 16

χ -- One Variable At a recent GRE test, each of 8 students took one of 5 subject tests Was there an equal number of test takers for each test? Test Psych Math Bio Lit Engin Obs. 1 4 6 4 Exp. 5.6 5.6 5.6 5.6 5.6 17 χ -- One Variable Write H 0 and H 1 : H 0 : Σ(O - E) 0 H 1 : Σ(O - E) 0 O observed frequencies E expected frequencies Specify α α.05 Calculate the χ statistic χ Σ[(O i -E i ) /E i ] 18 χ Calculations Psy Math Bio Lit Engin O i 1 4 6 4 E i 5.6 5.6 5.6 5.6 5.6 O i -E i 6.4-3.6-1.6.4-1.6 (O i -E i ) 40.96 1.96.56 1.6.56 (O i -E i ) /E i 7.31.31 0.46 0.9 0.46 19 χ Calculations χ Σ[(O i -E i ) /E i ] χ 7.31+.31+0.46+0.9+0.4610.83 Calculate the degrees of freedom: df number of groups - 1 5-1 4 Determine the critical value from a table of critical χ values df 4, α.05 Critical χ α.05 (4) 9.488 0

χ Decision If the observed / calculated value of χ is greater than or equal to the critical value of χ, then you can reject H 0 that there is no difference between the observed and expected frequencies Because the observed χ 10.83 is larger than the critical χ 9.488, we can reject H 0 that the observed and expected frequencies are the same 1 χ Test of Independence χ can also be used to determine if two variables are independent of each other E.g., is being an ailurophile independent of whether you are male or female? Write H 0 and H 1 : H 0 : ΣΣ(O - E) 0 H 1 : ΣΣ(O - E) 0 Specify α α.05 χ Test of Independence The procedure for answering such questions is virtually identical to the one variable χ procedure, except that we have no theoretical basis for the expected frequencies The expected frequencies are derived from the data 3 χ Test of Independence Male Female Total Ailurophile 4 37 61 Non-ailurophile 1 7 19 Total 36 44 80 The expected frequencies are given by the formula to the right: ric j Eij T E expected frequency for cell at row i and column j ij r total for row i i c total for column j j T total number of observations 4

Ailurophile χ Test of Independence Male Female Total O 114 O 137 E11(61*36) E1(61*44) r 161 /807.45 /8033.55 O 11 O 7 Non-ailurophile E1(19*36) E(19*44) r 19 /808.55 /8010.45 Total c 136 c 44 T80 5 χ χ Test of Independence Calculate the observed value of χ r c ( Oij Eij) i 1 j 1 ( 4 7.45) ( 37 33.55) ( 1 8.55) ( 7 10.45) + + 7.45 33.55 0.434 + 0.355 + 1.39 + 1.139 3.319 E ij 8.55 + 10.45 6 χ Test of Independence First, determine the degrees of freedom: df (r - 1) * ( c - 1) In this example, the number of rows (r) is, and the number of columns (c) is, so the degrees of freedom are ( - 1) * ( - 1) 1 Determine the critical value of χ from a table of critical χ values Critical χ α.05 (1)3.841 7 χ Test of Independence Make the decision If the observed /calculated value of χ is greater than or equal to the critical value of χ, then you can reject H 0 that the expected and observed frequencies are equal If this example, the observed χ 3.319 is not greater than or equal to the critical χ 3.841, so we fail to reject H 0 8

Requirements for the Use of χ Even though χ makes no assumptions about the underlying distribution, it does make some assumptions that needs to be met prior to use Assumption of independence Frequencies must be used, not percentages Sufficiently large sample size 9 Assumption of Independence Each observation must be unique; that is an individual cannot be contained in more than one category, or counted in one category more than once When this assumption is violated, the probability of making a Type-I error is greatly enhanced 30 Frequencies The data must correspond to frequencies in the categories; percentages are not appropriate as data 31 Sufficient Sample Size Different people have different recommendation about how large the sample should be, and what the minimum expected frequency in each cell should be Good, Grover, and Mitchell (1977) suggest that the expected frequencies can be as low as 0.33 without increasing the likelihood of making a Type-I error Small samples reduce power 3