Psych 230 Psychological Measurement and Statistics Pedro Wolf December 9, 2009
This Time. Non-Parametric statistics Chi-Square test One-way Two-way
Statistical Testing 1. Decide which test to use 2. State the hypotheses (H 0 and H 1 ) 3. Calculate the obtained value - calculate r 4. Calculate the critical value (size of α) 5. Make our conclusion
1. Decide which test to use
    Are we comparing a sample to a population?
        Yes: Z-test if we know the population standard deviation
        Yes: One-sample T-test if we do not know the population std dev
        No: Keep looking
    Are we looking for the difference between samples?
        Yes: How many samples are we comparing?
            Two: Use the Two-sample T-test
                Are the samples independent or related?
                    » Independent: Use Independent Samples T-test
                    » Related: Use Related Samples T-test
            More than Two Groups: Keep looking
                How many independent variables?
                    One: Use a One-way ANOVA
                    Two: Use a Two-factor ANOVA
        No: Keep looking
    Are we looking for the relationship between variables?
        Yes: Use the Correlation test
    Are we examining the frequencies of categorical, mutually exclusive classes?
        Yes: How many variables are we examining?
            One: Use one-way $\chi^2$
            Two: Use two-way $\chi^2$
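The decision tree above can be sketched as a small function. This is only an illustration: the function name `choose_test`, its parameter names, and the `goal` labels are hypothetical and not part of the course material.

```python
def choose_test(goal, *, sigma_known=False, n_samples=2, related=False,
                n_factors=1, n_variables=1):
    """Return the test named by the slide's decision tree.

    goal (hypothetical labels): "sample_vs_population", "difference",
    "relationship", or "frequencies".
    """
    if goal == "sample_vs_population":
        # Z-test only when the population standard deviation is known
        return "Z-test" if sigma_known else "One-sample T-test"
    if goal == "difference":
        if n_samples == 2:
            return "Related Samples T-test" if related else "Independent Samples T-test"
        # More than two groups: the choice depends on the number of independent variables
        return "One-way ANOVA" if n_factors == 1 else "Two-factor ANOVA"
    if goal == "relationship":
        return "Correlation test"
    if goal == "frequencies":
        return "one-way chi-square" if n_variables == 1 else "two-way chi-square"
    raise ValueError(f"unknown goal: {goal!r}")


print(choose_test("frequencies", n_variables=1))  # one-way chi-square
```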
Assumptions
    Mutually exclusive categories
        One category per subject
    Independence of observations
    Equivalent criteria for entrance into each category
    Sufficient sample size
        General rule: none of the expected frequencies should be less than five
Chi-Square
    Chi-Square is used when subjects are measured using a nominal variable
        gender; political preference; handedness; nationality
    With these variables, we do not measure an amount, but instead we count the frequency of observations in each of the categories
        Are psychology majors more likely to vote Republican or Democrat?
    We are interested in proportions
Chi-Square In a Chi-Square test, we compare the actual frequencies of occurrence with what we would expect to happen by chance Two types of Chi-Square test One-way: we have one variable Two-way: we have two variables
One-Way Chi-Square We use this test when data consist of frequencies spread amongst different categories of a single variable Are men or women more likely to smoke? Is Coke or Pepsi more preferred by U.S. soda drinkers? Are different ethnic groups well represented at the UofA?
Problem
    In a Psyc 230 class, are there equal numbers of people from rural, suburban and urban backgrounds?
        Rural: 23
        Suburban: 119
        Urban: 50
    3 levels, but still just one variable
    One-way Chi-Square
2. State the Hypotheses
    $H_0$: all frequencies are equal
        Rural, suburban and urban backgrounds are equally represented in the class
    $H_A$: not all frequencies are equal
        Rural, suburban and urban backgrounds are not equally represented in the class
3. Calculate the obtained value ($\chi^2_{obt}$)
                        Rural   Suburban   Urban   Total
    Observed ($f_o$)      23       119       50     192
    Expected ($f_e$)      64        64       64     192
    $f_e$ in each category $= N / k = 192 / 3 = 64$
    $\chi^2_{obt} = \sum \frac{(f_o - f_e)^2}{f_e}$
    $= \frac{(23-64)^2}{64} + \frac{(119-64)^2}{64} + \frac{(50-64)^2}{64}$
    $= \frac{(-41)^2}{64} + \frac{(55)^2}{64} + \frac{(-14)^2}{64}$
    $= \frac{1681}{64} + \frac{3025}{64} + \frac{196}{64}$
    $= 26.266 + 47.266 + 3.062 = 76.594$
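The hand calculation above can be checked with a few lines of Python. This is a minimal sketch for the equal-frequencies case only; the function name `one_way_chi_square` is made up for illustration.

```python
def one_way_chi_square(observed):
    """Chi-square statistic when H0 says all k categories are equally frequent."""
    n = sum(observed)      # N, the total number of observations
    k = len(observed)      # k, the number of categories
    expected = n / k       # f_e = N / k, the same for every category
    # Sum of (f_o - f_e)^2 / f_e over all categories
    return sum((fo - expected) ** 2 / expected for fo in observed)


chi2_obt = one_way_chi_square([23, 119, 50])  # rural, suburban, urban
print(round(chi2_obt, 3))  # 76.594
```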
4. Calculate the critical value
    Assume α = 0.05
    Always a two-tail test with Chi-square
    Look up Table Q on page 572, Critical values of Chi-square: The $\chi^2$ tables
    df = k - 1 = 3 - 1 = 2
    $\chi^2_{crit}$ for 2 degrees of freedom and α = 0.05 is 5.99
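In the course the critical value comes from Table Q. As a cross-check, the table entries can be reproduced numerically. The sketch below uses only the Python standard library: it evaluates the chi-square cumulative distribution with the standard series for the regularized lower incomplete gamma function, then finds the critical value by bisection. The names `chi2_cdf` and `chi2_critical` are illustrative assumptions, not anything from the textbook.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    # Series: P(a, z) = z^a * e^(-z) * sum_{n>=0} z^n / Gamma(a + n + 1)
    term = 1.0 / math.gamma(a + 1.0)
    total = term
    n = 1
    while term > 1e-15 * total:
        term *= z / (a + n)
        total += term
        n += 1
    return math.exp(a * math.log(z) - z) * total


def chi2_critical(df, alpha=0.05):
    """Value that cuts off the upper alpha of the chi-square distribution."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, df) < 1 - alpha:  # widen until the target is bracketed
        hi *= 2
    for _ in range(100):                 # bisection on the CDF
        mid = (lo + hi) / 2
        if chi2_cdf(mid, df) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for df in (1, 2, 3):
    print(df, round(chi2_critical(df), 2))
```

For df = 1, 2 and 3 this should agree with the tabled values 3.84, 5.99 and 7.81 used in this lecture, up to rounding.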
5. Make our Conclusion
    $\chi^2_{crit}$ = 5.99
    $\chi^2_{obt}$ = 76.594
    $\chi^2_{obt}$ is inside the rejection region, so we reject $H_0$ and accept $H_A$
    We conclude that there is a significant difference in frequencies between our groups
        Rural, suburban and urban backgrounds are not equally represented in our class
Problem In a Psyc 230 class, does the proportion of males and females in the class match the proportion of males and females who are psychology majors? Psych majors: 75% female, 25% male Females in the class: 142 Males in the class: 56
2. State the Hypotheses
    The exact hypothesis will depend on the specific question we are asking
    $H_0$: frequencies in the class are equal to the psych population
        75% of our class are female, 25% are male
    $H_A$: frequencies in the class are not equal to the psych population
        The class is not 75% female and 25% male
3. Calculate the obtained value ($\chi^2_{obt}$)
                        Females          Males           Total
    Observed ($f_o$)      142              56             198
    Expected ($f_e$)      148.5            49.5           198
                          (198 × 0.75)     (198 × 0.25)
    $\chi^2_{obt} = \sum \frac{(f_o - f_e)^2}{f_e}$
    $= \frac{(142-148.5)^2}{148.5} + \frac{(56-49.5)^2}{49.5}$
    $= \frac{(-6.5)^2}{148.5} + \frac{(6.5)^2}{49.5}$
    $= \frac{42.25}{148.5} + \frac{42.25}{49.5}$
    $= 0.284 + 0.853 = 1.137$
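When the null hypothesis specifies unequal proportions, only the expected frequencies change: each $f_e$ is N times the hypothesized proportion. A minimal sketch follows, with the hypothetical function name `chi_square_from_proportions`.

```python
def chi_square_from_proportions(observed, proportions):
    """One-way chi-square where H0 gives a proportion for each category."""
    n = sum(observed)
    expected = [n * p for p in proportions]  # f_e = N * hypothesized proportion
    chi2 = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))
    return expected, chi2


# Class counts (females, males) against the 75% / 25% split among psych majors
expected, chi2_obt = chi_square_from_proportions([142, 56], [0.75, 0.25])
print(expected)            # [148.5, 49.5]
print(round(chi2_obt, 3))  # 1.138
```

Carrying full precision gives 1.138; the slide's 1.137 comes from cutting each term to three decimals before adding. Either way the value is well below the critical value.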
4. Calculate the critical value
    Assume α = 0.05
    Always a two-tail test with Chi-square
    Look up Table Q, Critical values of Chi-square: The $\chi^2$ tables
    df = k - 1 = 2 - 1 = 1
    $\chi^2_{crit}$ for 1 degree of freedom and α = 0.05 is 3.84
5. Make our Conclusion
    $\chi^2_{crit}$ = 3.84
    $\chi^2_{obt}$ = 1.137
    $\chi^2_{obt}$ is outside the rejection region, so we retain $H_0$
    We conclude that there is no significant difference in frequencies between our groups as compared to the population of psychology majors
        Males and females are represented in the class in proportion to their distribution in the population
Problem - Your turn
    After conducting a survey of tastes with 30 subjects, researchers found the following preference for soda:
        18 people preferred Coke
        12 people preferred Pepsi
    Is there a significant difference in the number of people who preferred Coke over Pepsi? Use α = 0.05.
    $f_e$ in each category $= N / k$
    $\chi^2_{obt} = \sum \frac{(f_o - f_e)^2}{f_e}$
2. State the Hypotheses
    $H_0$: all frequencies are equal
        Coke and Pepsi are equally represented
    $H_A$: not all frequencies are equal
        Coke and Pepsi are not equally represented
3. Calculate the obtained value ($\chi^2_{obt}$)
                        Coke   Pepsi   Total
    Observed ($f_o$)     18      12      30
    Expected ($f_e$)     15      15      30
    $\chi^2_{obt} = \sum \frac{(f_o - f_e)^2}{f_e}$
    $= \frac{(18-15)^2}{15} + \frac{(12-15)^2}{15}$
    $= \frac{(3)^2}{15} + \frac{(-3)^2}{15}$
    $= \frac{9}{15} + \frac{9}{15}$
    $= 0.6 + 0.6 = 1.2$
4. Calculate the critical value
    Assume α = 0.05
    Always a two-tail test with Chi-square
    Look up Table Q on page 572, Critical values of Chi-square: The $\chi^2$ tables
    df = k - 1 = 2 - 1 = 1
    $\chi^2_{crit}$ for 1 degree of freedom and α = 0.05 is 3.84
5. Make our Conclusion
    $\chi^2_{crit}$ = 3.84
    $\chi^2_{obt}$ = 1.2
    $\chi^2_{obt}$ is outside the rejection region, so we retain $H_0$
    We conclude that there is no significant difference in frequencies between our groups
        Coke and Pepsi are equally preferred
One-Way Chi-Square We use this test when data consist of frequencies spread amongst different categories of a single variable Are men or women more likely to smoke? Is Coke or Pepsi more preferred by U.S. soda drinkers? Are different ethnic groups well represented at the UofA?
Chi-Square What happens when we want to examine relationships between two variables? Are higher or lower levels of self-esteem more likely in athletes or non-athletes? Is season of birth related to the likelihood of suffering from Schizophrenia? Do men and women have different preferences for thin or heavy body-types?
Two-Way Chi-Square To answer questions concerning the relationship between two variables we use a two-way Chi-Square test Same logic as before - we will estimate what we would expect to see for each category if the null hypothesis was true, and then compare that to what we actually observed for each category
Problem Researchers are interested in the college experience of those who were the first in their family to attend college. One variable measured was if the students dropped out in their first semester. Is there a relationship between whether the student was the first in their family to go to college and whether they dropped out? Variables?
Problem
    First to go to college, dropped out: 15
    First to go to college, didn't drop out: 50
    Not first to go to college, dropped out: 15
    Not first to go to college, didn't drop out: 120
                         First   Not First   Total
    Dropped out            15        15        30
    Didn't drop out        50       120       170
    Total                  65       135       200
2. State the Hypotheses
    $H_0$: there is no relationship between the two variables
        whether you were first in your family to go to college and whether you drop out are not related (the variables are independent)
    $H_A$: there is a relationship between the two variables
        whether you were first in your family to go to college and whether you drop out are related (the relationship is not solely due to chance)
3. Calculate the obtained value ($\chi^2_{obt}$)
    $f_o$ ($f_e$)        First   Not First   Total
    Dropped out            15        15        30
    Didn't drop out        50       120       170
    Total                  65       135       200
    $f_e = \frac{(\text{cell's row total } f_o)(\text{cell's column total } f_o)}{N}$
    Dropped out, first: $f_e$ = (30)(65) / 200 = 1950 / 200 = 9.75, round to 10
    Dropped out, not first: $f_e$ = (30)(135) / 200 = 4050 / 200 = 20.25, round to 20
    Didn't drop out, first: $f_e$ = (170)(65) / 200 = 11050 / 200 = 55.25, round to 55
    Didn't drop out, not first: $f_e$ = (170)(135) / 200 = 22950 / 200 = 114.75, round to 115
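The expected-frequency rule for a two-way table (row total times column total, divided by N) can be sketched for a table of any size. The function name `expected_frequencies` and the list-of-rows layout are assumptions made for this example.

```python
def expected_frequencies(table):
    """Expected count for each cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]  # zip(*table) walks the columns
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


# Rows: dropped out, didn't drop out. Columns: first, not first.
observed = [[15, 15],
            [50, 120]]
print(expected_frequencies(observed))  # [[9.75, 20.25], [55.25, 114.75]]
```

These are the unrounded values; the slides round each one to a whole number before continuing.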
3. Calculate the obtained value ($\chi^2_{obt}$)
    $f_o$ ($f_e$)        First      Not First    Total
    Dropped out          15 (10)    15 (20)        30
    Didn't drop out      50 (55)    120 (115)     170
    Total                65         135           200
    $\chi^2_{obt} = \sum \frac{(f_o - f_e)^2}{f_e}$
    $= \frac{(15-10)^2}{10} + \frac{(15-20)^2}{20} + \frac{(50-55)^2}{55} + \frac{(120-115)^2}{115}$
    $= \frac{(5)^2}{10} + \frac{(-5)^2}{20} + \frac{(-5)^2}{55} + \frac{(5)^2}{115}$
    $= \frac{25}{10} + \frac{25}{20} + \frac{25}{55} + \frac{25}{115}$
    $= 2.5 + 1.25 + 0.45 + 0.22 = 4.42$
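The slides round each expected frequency to a whole number before computing the statistic. The sketch below compares that result with the one obtained from the unrounded expected frequencies (9.75, 20.25, 55.25, 114.75). The helper name `chi_square` is hypothetical.

```python
def chi_square(observed, expected):
    """Sum of (f_o - f_e)^2 / f_e over every cell of two same-shaped tables."""
    return sum((fo - fe) ** 2 / fe
               for row_o, row_e in zip(observed, expected)
               for fo, fe in zip(row_o, row_e))


observed = [[15, 15], [50, 120]]
rounded = [[10, 20], [55, 115]]             # expected frequencies as used on the slide
exact = [[9.75, 20.25], [55.25, 114.75]]    # expected frequencies before rounding

print(round(chi_square(observed, rounded), 2))  # 4.42
print(round(chi_square(observed, exact), 2))    # 4.93
```

Both values exceed the critical value of 3.84 for 1 degree of freedom, so the decision is the same in this example, but rounding expected frequencies can matter when the obtained value is close to the cutoff.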
4. Calculate the critical value
    Assume α = 0.05
    Always a two-tail test with Chi-square
    Look up Table Q, Critical values of Chi-square: The $\chi^2$ tables
    df = (number of rows - 1)(number of columns - 1) = (2-1)(2-1) = 1
    $\chi^2_{crit}$ for 1 degree of freedom and α = 0.05 is 3.84
5. Make our Conclusion
    $\chi^2_{crit}$ = 3.84
    $\chi^2_{obt}$ = 4.42
    $\chi^2_{obt}$ is inside the rejection region, so we reject $H_0$ and accept $H_A$
    We conclude that there is a significant relationship between likelihood of dropping out of college and whether you were the first in your family to go
        You are more likely to drop out if you were the first in your family to attend college
Problem Researchers hypothesize that personality traits may be related to preference for color. In a study, researchers asked each subject to choose their favorite color from a choice of red, yellow, green and blue, and then tested whether the subject was either introverted or extroverted. Is there a relationship between these variables? Variables?
Problem
                 Red   Yellow   Green   Blue   Total
    Introvert     10      3       15     22      50
    Extrovert     90     17       25     18     150
    Total        100     20       40     40     200
2. State the Hypotheses
    $H_0$: there is no relationship between the two variables
        your personality type (introvert or extrovert) is not related to your favorite color (the variables are independent)
    $H_A$: there is a relationship between the two variables
        your personality type (introvert or extrovert) is related to your favorite color (the variables are not independent)
        the relationship is not solely due to chance
3. Calculate the obtained value ($\chi^2_{obt}$)
    $f_o$        Red   Yellow   Green   Blue   Total
    Introvert     10      3       15     22      50
    Extrovert     90     17       25     18     150
    Total        100     20       40     40     200
    $f_e = \frac{(\text{cell's row total } f_o)(\text{cell's column total } f_o)}{N}$
    Introvert, Red: $f_e$ = (50)(100) / 200 = 5000 / 200 = 25
    Introvert, Yellow: $f_e$ = (50)(20) / 200 = 1000 / 200 = 5
    Introvert, Green: $f_e$ = (50)(40) / 200 = 2000 / 200 = 10
    Introvert, Blue: $f_e$ = (50)(40) / 200 = 2000 / 200 = 10
    Extrovert, Red: $f_e$ = (150)(100) / 200 = 15000 / 200 = 75
    Extrovert, Yellow: $f_e$ = (150)(20) / 200 = 3000 / 200 = 15
    Extrovert, Green: $f_e$ = (150)(40) / 200 = 6000 / 200 = 30
    Extrovert, Blue: $f_e$ = (150)(40) / 200 = 6000 / 200 = 30
3. Calculate the obtained value ($\chi^2_{obt}$)
    $f_o$ ($f_e$)   Red       Yellow    Green     Blue      Total
    Introvert       10 (25)    3 (5)    15 (10)   22 (10)     50
    Extrovert       90 (75)   17 (15)   25 (30)   18 (30)    150
    Total           100       20        40        40         200
    $\chi^2_{obt} = \sum \frac{(f_o - f_e)^2}{f_e}$
    $= \frac{(10-25)^2}{25} + \frac{(3-5)^2}{5} + \frac{(15-10)^2}{10} + \frac{(22-10)^2}{10} + \frac{(90-75)^2}{75} + \frac{(17-15)^2}{15} + \frac{(25-30)^2}{30} + \frac{(18-30)^2}{30}$
    $= \frac{(-15)^2}{25} + \frac{(-2)^2}{5} + \frac{(5)^2}{10} + \frac{(12)^2}{10} + \frac{(15)^2}{75} + \frac{(2)^2}{15} + \frac{(-5)^2}{30} + \frac{(-12)^2}{30}$
    $= \frac{225}{25} + \frac{4}{5} + \frac{25}{10} + \frac{144}{10} + \frac{225}{75} + \frac{4}{15} + \frac{25}{30} + \frac{144}{30}$
    $= 9 + 0.8 + 2.5 + 14.4 + 3 + 0.27 + 0.83 + 4.8 = 35.6$
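With eight cells the hand calculation is error-prone, so it may help to check it in code. The sketch below computes the expected frequencies, the obtained value, and the degrees of freedom in one pass; `two_way_chi_square` is an illustrative name, not a library function.

```python
def two_way_chi_square(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n   # expected count for this cell
            chi2 += (fo - fe) ** 2 / fe
    df = (len(table) - 1) * (len(table[0]) - 1)      # (rows - 1)(columns - 1)
    return chi2, df


# Rows: introvert, extrovert. Columns: red, yellow, green, blue.
colors = [[10, 3, 15, 22],
          [90, 17, 25, 18]]
chi2_obt, df = two_way_chi_square(colors)
print(round(chi2_obt, 2), df)  # 35.6 3
```

Here every expected frequency is already a whole number, so no rounding issue arises and the result matches the hand calculation.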
4. Calculate the critical value
    Assume α = 0.05
    Always a two-tail test with Chi-square
    Look up Table Q, Critical values of Chi-square: The $\chi^2$ tables
    df = (number of rows - 1)(number of columns - 1) = (2-1)(4-1) = 3
    $\chi^2_{crit}$ for 3 degrees of freedom and α = 0.05 is 7.81
5. Make our Conclusion
    $\chi^2_{crit}$ = 7.81
    $\chi^2_{obt}$ = 35.6
    $\chi^2_{obt}$ is inside the rejection region, so we reject $H_0$ and accept $H_A$
    We conclude that there is a significant relationship between color choice and extroverted/introverted personality types
Final Exam Questions
    Multiple-choice (30 questions @ 2 points each): 60 total
    Short-answer (4 questions @ 10 points each): 40 total
    Some questions from throughout the semester
    You'll be provided with all required formulas and tables
    Remember to bring calculators and pencils
Populations and Samples
    A population is all possible members of the group of interest
    A sample is a subset of the population
Descriptive and Inferential Statistics
    Descriptive statistics: procedures which organize and summarize sample data
    Inferential statistics: procedures for drawing inferences about populations
Statistics and Parameters
    Statistic: a number that describes an aspect of a sample of scores
    Parameter: a number that describes an aspect of a population of scores
        often inferred through sampling
Experimental Studies
    In a true experiment, the researcher actively changes or manipulates one variable and then measures participants' scores on another variable to see if a relationship is produced
        example: the effect of alcohol on stats test scores
    Two types of variable:
        independent variable
            manipulated: a variable the experimenter actually manipulates (e.g. treatment condition)
            subject: a measurable aspect of the individual participants which the experimenter does not change (e.g. sex)
        dependent variable
Correlational Studies
    The researcher measures participants' scores on two variables and then determines whether a relationship is present
Which Scale?
    1. Does the variable have an intrinsic value? NO ==> Nominal
    2. Does the variable have equal values between scores? NO ==> Ordinal
    3. Does the variable have a real zero point? NO ==> Interval; YES ==> Ratio
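The three questions are asked in order, and the first NO decides the scale. A minimal sketch of that logic follows; the function and parameter names are invented for illustration.

```python
def measurement_scale(has_intrinsic_value, equal_intervals, real_zero):
    """Walk the three questions in order; the first NO decides the scale."""
    if not has_intrinsic_value:
        return "Nominal"
    if not equal_intervals:
        return "Ordinal"
    if not real_zero:
        return "Interval"
    return "Ratio"


# Handedness only names a category, so the first answer is already NO
print(measurement_scale(False, False, False))  # Nominal
```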
Things you need to know
    Be able to describe a normal distribution
    Know the difference between a null and alternative hypothesis
    Know the difference between Type I and Type II errors
        A Type I error is defined as rejecting $H_0$ when $H_0$ is actually true
        A Type II error is defined as retaining $H_0$ when $H_0$ is false (and $H_1$ is actually true)
    Correlation is not causation
Things you need to know (continued) A linear relationship forms a pattern on a scatterplot that fits a straight line In a positive linear relationship, as the scores on the X variable increase, the scores on the Y variable also tend to increase In a negative linear relationship, as the scores on the X variable increase, the scores on the Y variable tend to decrease
Things you need to know (continued)
    The assumptions of chi square
        Mutually exclusive categories
            One category per subject
        Independence of observations
        Equivalent criteria for entrance into each category
        Sufficient sample size
            General rule: none of the expected frequencies should be less than five
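The sample-size rule is the one assumption that can be checked directly from a table of counts. The sketch below applies it to the two two-way tables from this lecture; the function names are hypothetical, and the expected frequencies are computed without rounding.

```python
def smallest_expected_frequency(table):
    """Smallest (row total * column total) / N across all cells."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)


def meets_sample_size_rule(table, minimum=5):
    """True when no expected frequency is less than the minimum."""
    return smallest_expected_frequency(table) >= minimum


dropout = [[15, 15], [50, 120]]
colors = [[10, 3, 15, 22], [90, 17, 25, 18]]
print(smallest_expected_frequency(dropout), meets_sample_size_rule(dropout))  # 9.75 True
print(smallest_expected_frequency(colors), meets_sample_size_rule(colors))    # 5.0 True
```

The color table only just satisfies the rule: the Introvert/Yellow cell has an expected frequency of exactly 5, even though its observed count is 3. The rule applies to expected frequencies, not observed ones.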
Know how to use this decision tree
    Are we comparing a sample to a population?
        Yes: Z-test if we know the population standard deviation
        Yes: One-sample T-test if we do not know the population std dev
        No: Keep looking
    Are we looking for the difference between samples?
        Yes: How many samples are we comparing?
            Two: Use the Two-sample T-test
                Are the samples independent or related?
                    » Independent: Use Independent Samples T-test
                    » Related: Use Related Samples T-test
            More than Two Groups: Keep looking
                How many independent variables?
                    One: Use a One-way ANOVA
                    Two: Use a Two-factor ANOVA
        No: Keep looking
    Are we looking for the relationship between variables?
        Yes: Use the Correlation test
    Are we examining the frequencies of categorical, mutually exclusive classes?
        Yes: How many variables are we examining?
            One: Use one-way $\chi^2$
            Two: Use two-way $\chi^2$
Know how to (continued) calculate an arithmetic mean and standard deviation use the various tables in the back of the book related to material covered. create a scatter plot correctly from raw data
Know how to (continued)
    use the formula to calculate r
    calculate $r^2$
    calculate Y or X
    draw a regression line when given only the slope and intercept
    calculate the obtained value ($\chi^2_{obt}$), both one-way and two-way
Know How to interpret ANOVA results and terms If I give you a complete F table tell me what that means conceptually If I give you the results of a post-hoc test interpret those results