Sociology 54

Testing for differences between two sample means

Conceptually, comparing means from two different samples is the same as what we've done in one-sample tests, except that now the hypotheses focus on the parameters of two populations.

To make comparisons for two populations, consider whether the samples are independent or dependent.

Independent samples: Selection of members of one sample has no influence on the selection of members of the other sample.

Dependent samples: Selection of members for one sample determines the characteristics of the members for the other sample.

When we compare two groups, we still state two competing hypotheses. With two samples, we are now dealing with a sampling distribution of differences between sample means. The outcome of interest is the difference between two sample statistics (e.g. the difference in mean hours spent on housework between men and women).

Just as the sampling distribution of sample means is normal, so is the sampling distribution of differences between sample means (a corollary of the Central Limit Theorem).

Conducting a hypothesis test comparing two samples:

Calculating a test statistic to determine whether the difference between two sample means is real or simply due to chance variation is conceptually the same as what we've already reviewed for a single sample.

To compare two groups on a quantitative characteristic, we make inferences about their population means $\mu_1$ and $\mu_2$ and the difference between them.

Test statistic = (observed difference - expected difference) / amount of variability in sampling distribution of differences

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}}$$

Note that identification of which group is called 1 and which is called 2 is arbitrary.

What is the standard error of the sampling distribution of the estimated difference between the two sample means? That is, the degree to which the difference would vary if we repeatedly took samples of size $n_1$ and $n_2$.
Comparing sample means (interval measures) of two independent LARGE samples with unequal variances (>00 for each sample)

$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Where
$s_1$ = standard deviation of sample 1
$s_2$ = standard deviation of sample 2
$n_1$ = size of sample 1
$n_2$ = size of sample 2

When two estimates are formed from independent samples, the sampling distribution of their difference has variance equal to the sum of the variances of the sampling distributions of the separate estimates.

The corresponding test statistic is:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}}$$

Recall that once you've calculated the test statistic for your two samples, you can find the appropriate p-value that corresponds to that statistic. That is, what is the likelihood that you would draw two samples with a difference as large as that observed, if in fact there really were no difference between the two groups? If the probability is small, we may reject the null hypothesis and tentatively accept the research hypothesis.

Guidelines to test for statistically significant differences between means of two groups.
1. Choose a significance (α) level.
2. State hypotheses.
3. Calculate the appropriate test statistic.
4. Determine the p-value associated with the test statistic.
5. Draw conclusions.
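The large-sample procedure above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original handout: the function name `two_sample_z` and the example values are hypothetical, and the two-tailed p-value uses the normal approximation, which assumes both samples are large.

```python
import math

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2, null_diff=0.0):
    """Large-sample z test for the difference between two independent means.

    Returns (standard error, z statistic, two-tailed p-value).
    """
    # Variance of the difference = sum of the variances of the two sample means
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    # (observed difference - expected difference) / standard error
    z = ((mean1 - mean2) - null_diff) / se
    # Two-tailed p-value from the standard normal: 2 * (1 - Phi(|z|))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p_value

# Hypothetical summary statistics (illustrative values only)
se, z, p_value = two_sample_z(10.0, 3.0, 400, 9.5, 4.0, 400)
print(f"SE = {se:.3f}, z = {z:.3f}, two-tailed p = {p_value:.4f}")
```

With a chosen α of .05, a p-value below .05 would lead us to reject the null hypothesis of no difference between the population means.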
Two independent LARGE samples

A sample of 000 students drawn from a public university finds that students work . hours per week while a sample of 900 students drawn from a private university finds that their students work an average of 9. hours per week. The sample standard deviations are 0.8 hours for the public university and 9.6 hours for the private university. Is there a significant difference in the number of hours worked in public versus private universities?
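Comparing sample means for two independent SMALL samples (interval measures).

You must make the assumption that the population variances are equal to use this formula (Kurtz pp. 83-86).

First, obtain a pooled estimate of the standard deviation for the two groups:

$$\hat{\sigma} = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$$

Although we assume that the population variances are equal, to obtain the best possible estimate of the population variance, we take a weighted average of the two sample variances rather than arbitrarily choose one of them as the estimate.

Then obtain the estimated standard error of the sampling distribution of differences using the pooled estimate of the standard deviation:

$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\hat{\sigma}^2}{n_1} + \frac{\hat{\sigma}^2}{n_2}}$$

This is equivalent to:

$$\hat{\sigma}_{\bar{x}_1 - \bar{x}_2} = \hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

The appropriate test statistic is:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\hat{\sigma}_{\bar{x}_1 - \bar{x}_2}}$$

with degrees of freedom = $n_1 + n_2 - 2$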
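As a rough sketch of the pooled-variance calculation, assuming the standard form of the pooled estimate (each sample variance weighted by its own degrees of freedom), the steps might look like this in Python. The function name `pooled_t` and the example values are hypothetical.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance (pooled) t test for two independent small samples.

    Returns (pooled standard deviation, standard error, t statistic, df).
    """
    df = n1 + n2 - 2
    # Weighted average of the two sample variances (weights = n - 1)
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference using the pooled estimate
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Null hypothesis: no difference between the population means
    t = (mean1 - mean2) / se
    return math.sqrt(pooled_var), se, t, df

# Hypothetical summary statistics (illustrative values only)
pooled_sd, se, t, df = pooled_t(12.0, 2.0, 10, 10.0, 2.0, 10)
print(f"pooled SD = {pooled_sd:.3f}, SE = {se:.3f}, t = {t:.3f}, df = {df}")
```

The resulting t is compared against the t distribution with the returned degrees of freedom to obtain the p-value.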
Two independent small samples (assume population variances are equal)

The following two samples indicate salaries for male and female professors (in 1969!). Could these differences in salary arise just by chance?

Sample of male professors: N=0, $\bar{x}$ = 6 (salary figure in 1000's), s = 3.5
Sample of female professors: N=5, $\bar{x}$ = , s = .83
Comparing proportions for two independent large samples (Kurtz pp. 9-95)

Let's say that you have two independent samples with dichotomous measures. Is that dichotomous variable distributed similarly in the two populations? We are now comparing samples with a qualitative response variable.

This test requires that both samples have 30 or more members, and the resulting statistic is a z score.

The test statistic is the ratio of the difference between two sample proportions to the standard error of the difference between the two proportions. The test statistic is estimated with the following equation:

Test Statistic:

$$z = \frac{p_1 - p_2}{\sqrt{p_c(1 - p_c)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where

$$p_c = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$

$p_c$ is a weighted average of the two proportions to adjust if the two samples are of unequal size.
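The proportion test can be sketched the same way. This is an illustrative example under the assumption that the standard error is built from the weighted average proportion described above. The function name `two_proportion_z` and the example values are hypothetical.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """Large-sample z test for the difference between two independent proportions.

    Returns (weighted average proportion, standard error, z, two-tailed p-value).
    """
    # Weighted average of the two sample proportions (weights = sample sizes)
    p_c = (n1 * p1 + n2 * p2) / (n1 + n2)
    # Standard error of the difference under the null of equal proportions
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_c, se, z, p_value

# Hypothetical proportions and sample sizes (illustrative values only)
p_c, se, z, p_value = two_proportion_z(0.60, 100, 0.40, 100)
print(f"p_c = {p_c:.3f}, SE = {se:.4f}, z = {z:.3f}, two-tailed p = {p_value:.4f}")
```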
Two independent large samples (comparing proportions)

A study found the following results: Of 3 male students, 53% worked more than 8 hours per week. Of 79 female students, 48% worked more than 8 hours per week. Is there a significant difference between male and female students in the number of hours worked?
SPSS One-Sample t Test

A newly created random number generator is supposed to generate a sequence of digits such that each digit is equally likely to be any of 0, 1, 2, ..., 9. The first 20 numbers generated are:

7 7 3 0 5 6 3 6 0 9 9 4 0 8 5 0 6

As a check of whether the process works correctly, test whether the mean differs significantly from the value expected. Report the p-value and interpret. You should also do this by hand and confirm that you get the same results as those reported by SPSS.

SPSS Commands

Analyze - Compare Means - One-Sample T Test - Select Test Variable (change) and Test Value ( ) - Okay

One-Sample Statistics

            N     Mean    Std. Deviation    Std. Error Mean
NUMBERS     20    4.15    3.08              .69

One-Sample Test (Test Value = 4.5)

                                                                  95% Confidence Interval of the Difference
            t        df    Sig. (2-tailed)    Mean Difference     Lower      Upper
NUMBERS     -.508    19    .617               -.35                -1.79      1.09
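The by-hand check can be sketched in Python from summary statistics alone. The sketch assumes the values N = 20, mean = 4.15 and standard deviation = 3.08 read from the SPSS output, and uses the tabled 95% critical value of t for 19 degrees of freedom (2.093). The function name `one_sample_t` is hypothetical.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, test_value):
    """One-sample t test from summary statistics.

    Returns (standard error of the mean, t statistic, degrees of freedom).
    """
    se = sample_sd / math.sqrt(n)
    t = (sample_mean - test_value) / se
    return se, t, n - 1

# Expected mean if each digit 0..9 is equally likely
expected_mean = sum(range(10)) / 10

# Summary values assumed from the SPSS output (N, mean, standard deviation)
se, t, df = one_sample_t(4.15, 3.08, 20, expected_mean)

# 95% confidence interval for the mean difference
T_CRIT_95_DF19 = 2.093  # tabled two-tailed critical value for df = 19
mean_diff = 4.15 - expected_mean
lower = mean_diff - T_CRIT_95_DF19 * se
upper = mean_diff + T_CRIT_95_DF19 * se

print(f"SE = {se:.2f}, t = {t:.3f}, df = {df}")
print(f"95% CI for the difference: ({lower:.2f}, {upper:.2f})")
```

Because the interval contains 0, the sample mean does not differ significantly from the expected value at the .05 level.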
Two-Sample t Test

Using GSS98.SAV file, determine whether there is a significant difference between men and women in their response to the following question:

ABANY: "Please tell me whether or not you think it should be possible for a pregnant woman to obtain a legal abortion if... the woman wants it for any reason?"
SEX: Respondent's Sex

Remember to use syntax file set printback on.

Conduct test for differences: Analyze - Compare Means - Independent Samples T test - Select test (ABANY) and group variables (SEX) - Okay

Another Two-Sample t-test

Is there a significant difference by age (people older than and younger than 40) in the response to the question above? Recode age into two categories (ages 0-39 and 40 and older) and conduct a t-test.

AGE: Age of Respondent
ADDITIONAL NOTES FOR SELF STUDY AND FUTURE REFERENCE

Comparing two dependent samples with interval measures (Kurtz pp. 95-98)

Dependent samples occur when each observation in sample 1 matches with an observation from sample 2 (often called matched-pairs data). This most commonly occurs when each sample has the same subjects, an example of repeated measurement data.

Student's T test for Paired Comparisons

A special case of the one-group t test using difference scores (e.g. difference in SAT scores from time 1 to time 2) from each pair of dependent subjects. For matched-pairs data, the difference between the means of the two groups equals the mean of the difference scores.

$$t = \frac{\bar{\delta}}{s_\delta / \sqrt{N}}$$

$\bar{\delta}$ = mean difference score
$s_\delta$ = standard deviation of difference scores
N = sample size

Note: This formula is equivalent to that presented on page 96 of Kurtz. See also notes on the example of SAT scores and the effect of a prep course discussed in an earlier handout.

Some advantages to analysis with dependent samples
1. Known sources of potential bias are controlled. Using the same subjects in each sample, for example, keeps many possible confounding factors fixed.
2. The standard error of the difference may be smaller with dependent samples.

Assumptions
1. Random and independent sampling.
2. Normality assumption: The normality assumption applies to the population of difference scores. The dependent groups t test is generally considered robust against violation of this assumption if N > 30.
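A short Python sketch of the paired-comparisons test follows. It assumes the standard error of the mean difference is the sample standard deviation of the difference scores divided by the square root of N. The function name `paired_t` and the scores are made up for illustration.

```python
import math
import statistics

def paired_t(time1, time2):
    """Paired t test computed as a one-sample t test on difference scores.

    Returns (mean difference, SD of differences, t statistic, df).
    """
    diffs = [after - before for before, after in zip(time1, time2)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 in the denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return mean_diff, sd_diff, t, n - 1

# Hypothetical scores for the same five subjects at two time points
time1 = [500, 520, 480, 510, 490]
time2 = [510, 540, 480, 530, 500]

mean_diff, sd_diff, t, df = paired_t(time1, time2)
print(f"mean difference = {mean_diff}, SD = {sd_diff:.3f}, t = {t:.3f}, df = {df}")

# The mean of the difference scores equals the difference between the two means
print(statistics.mean(time2) - statistics.mean(time1))
```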
SPSS One-Sample t Test

File Name: ch7.sav in soc 54 work directory. Difference scores for students who've taken an SAT prep course.

Analyze - Compare Means - One-Sample T Test - Select Test Variable (change) and Test Value (0) - Okay

One-Sample Statistics

           N     Mean       Std. Deviation    Std. Error Mean
CHANGE     10    12.0000    35.8391           11.3333

One-Sample Test (Test Value = 0)

                                                                95% Confidence Interval of the Difference
           t        df    Sig. (2-tailed)    Mean Difference    Lower        Upper
CHANGE     1.059    9     .317               12.0000            -13.6378     37.6378
You could also compare these two groups using a paired sample t-test (these are dependent samples).

Paired Sample t-test

Using the dataset containing 10 observations on pre and post SAT scores, use the paired sample t-test to determine whether there is a significant difference between the two scores.

Commands: Analyze - Compare Means - Paired Sample t-test - Select paired variables (variable 1 and variable 2) - Okay

Syntax:

    T-TEST PAIRS= origscre WITH newscore (PAIRED)
      /CRITERIA=CIN(.95)
      /MISSING=ANALYSIS.

Paired Samples Test

Pair 1: ORIGSCRE - NEWSCORE

Paired Differences
    Mean                                         -12.0000
    Std. Deviation                               35.83915
    Std. Error Mean                              11.33333
    95% Confidence Interval of the Difference    Lower -37.6378, Upper 13.6378
t                                                -1.059
df                                               9
Sig. (2-tailed)                                  .317

Note that these two tests yield exactly the same result.
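The equivalence of the two tests can be checked from the summary statistics alone. This sketch assumes N = 10, a mean difference of 12 (new score minus original score) and a standard deviation of 35.83915 as read from the SPSS output, and uses the tabled 95% critical value of t for 9 degrees of freedom (about 2.262). The function name `t_from_summary` is hypothetical.

```python
import math

T_CRIT_95_DF9 = 2.262157  # tabled two-tailed critical value for df = 9

def t_from_summary(mean_diff, sd_diff, n, t_crit):
    """t statistic and confidence interval for a mean difference score.

    Returns (standard error, t statistic, CI lower bound, CI upper bound).
    """
    se = sd_diff / math.sqrt(n)
    t = mean_diff / se
    return se, t, mean_diff - t_crit * se, mean_diff + t_crit * se

# One-sample test on CHANGE (new score minus original score)
one_sample = t_from_summary(12.0, 35.83915, 10, T_CRIT_95_DF9)

# Paired test on ORIGSCRE - NEWSCORE: same differences with the sign reversed
paired = t_from_summary(-12.0, 35.83915, 10, T_CRIT_95_DF9)

for label, (se, t, lower, upper) in (("one-sample", one_sample), ("paired", paired)):
    print(f"{label}: SE = {se:.4f}, t = {t:.3f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

Only the sign changes, because the paired test subtracts the scores in the opposite order. The standard error, the magnitude of t, and the p-value are identical.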