Comparing Two Populations. Topic 15 - Two Sample Inference I. Comparing Two Means. Comparing Two Pop Means. Background Reading

Topic 15 - Two Sample Inference I
STAT 511, Professor Bruce Craig

Background Reading
Devore: Sections 9.1-9.2

Comparing Two Populations
- Research often involves the comparison of two or more samples from different populations.
- Graphical summaries provide a visual comparison of populations: boxplots, histograms, stem-and-leaf plots.
- Distributions may differ in terms of center or spread.

Comparing Two Pop Means
- First consider comparison of centers (means).
- Consider the following notation:

                  Population 1    Population 2
  Pop Size        $N_1$           $N_2$
  Mean            $\mu_1$         $\mu_2$
  Std. Dev.       $\sigma_1$      $\sigma_2$
  Sample Size     $m$             $n$
  Sample Mean     $\bar{x}$       $\bar{y}$
  Sample StDev    $s_1$           $s_2$

- Interested in estimating $\mu_1 - \mu_2$.
- Will use $\bar{x} - \bar{y}$ as the estimate.
- Need the standard error of this estimate.

Need to construct the sampling dist of $\bar{X} - \bar{Y}$
- Recall: if $X$ is normal, then $\bar{X}$ is normal with mean $\mu_1$ and std dev $\sigma_1/\sqrt{m}$.
- And if $Y$ is normal, then $\bar{Y}$ is normal with mean $\mu_2$ and std dev $\sigma_2/\sqrt{n}$.

- A linear combination of normal RVs is also normal; this also applies to the difference (page 45).
- Given the $X$'s and $Y$'s are independent, $\bar{X} - \bar{Y}$ has
  Mean: $\mu_1 - \mu_2$
  Standard deviation: $\sigma_{\bar{X}-\bar{Y}} = \sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}$
- Will use this sampling distribution for construction of confidence intervals and hypothesis tests.

Difference of Two Means
$(1-\alpha)100\%$ Confidence Interval of $\mu_1 - \mu_2$:

$$(\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{m} + \frac{\sigma_2^2}{n}}$$

What do we conclude if 0 is in the interval?

Hypothesis Testing
- Similar procedure as the one population case.
  $H_0: \mu_1 - \mu_2 = \Delta_0$
  $H_a: \mu_1 - \mu_2 > \Delta_0$, or $\mu_1 - \mu_2 < \Delta_0$, or $\mu_1 - \mu_2 \neq \Delta_0$
- Will use $\bar{x} - \bar{y}$ for the test.
- Will assess incompatibility using standardization:

$$z_s = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$$

- Given $z_s$, the P-value is computed in the usual way.

Example
Consumer Reports decided to compare several different brands of batteries. Suppose that 100 Duracell Alkaline AA batteries had an average lifetime of 4.1 hours and 100 Energizer Alkaline AA batteries had an average lifetime of 4.5 hours. If it were assumed that the two distributions were normally distributed with standard deviations 1.20 and 1.35 hours respectively, is there evidence that the two types of batteries have a different average lifetime?

$H_0: \mu_1 - \mu_2 = 0$
$H_a: \mu_1 - \mu_2 \neq 0$

$$z = \frac{(4.1 - 4.5) - 0}{\sqrt{\dfrac{1.20^2}{100} + \dfrac{1.35^2}{100}}} = \frac{-0.4}{0.18} = -2.21$$

The P-value is $P(|Z| > 2.21) = 2(.0136) = 0.0272$. For $\alpha > .0272$, we would reject $H_0$ and conclude there is a difference. It appears the Energizer lasts longer.
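The known-variance test and interval above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names `two_sample_z` and `z_interval` are hypothetical, only the standard library is used, and the battery standard deviations are taken as 1.20 and 1.35 hours, the values that reproduce the standard error of 0.18 in the worked example.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar, ybar, sigma1, sigma2, m, n, delta0=0.0):
    """Return (z statistic, two-sided P-value) for H0: mu1 - mu2 = delta0.

    Assumes both population standard deviations are known.
    """
    se = sqrt(sigma1**2 / m + sigma2**2 / n)  # std. dev. of Xbar - Ybar
    z = (xbar - ybar - delta0) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


def z_interval(xbar, ybar, sigma1, sigma2, m, n, conf=0.95):
    """Return the (1 - alpha)100% confidence interval for mu1 - mu2."""
    se = sqrt(sigma1**2 / m + sigma2**2 / n)
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_{alpha/2}
    diff = xbar - ybar
    return diff - z_crit * se, diff + z_crit * se


# Battery example: Duracell (sample 1) vs. Energizer (sample 2).
z, p = two_sample_z(4.1, 4.5, 1.20, 1.35, 100, 100)
lo, hi = z_interval(4.1, 4.5, 1.20, 1.35, 100, 100, conf=0.95)
print(f"z = {z:.2f}, two-sided P-value = {p:.4f}")
print(f"95% CI for mu1 - mu2: ({lo:.3f}, {hi:.3f})")
```

Because the code does not round z to two decimals before looking up the tail area, its P-value differs slightly in the third decimal from the table-based value in the example. The 95% interval lies entirely below zero, which matches rejecting $H_0$ at $\alpha = .05$.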

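Type II Error / Power
The following formulae result for $\beta$:

  $H_a: \mu_1 - \mu_2 > \Delta_0$:  $\beta = \Phi\left(z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

  $H_a: \mu_1 - \mu_2 < \Delta_0$:  $\beta = 1 - \Phi\left(-z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

  $H_a: \mu_1 - \mu_2 \neq \Delta_0$:  $\beta = \Phi\left(z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right) - \Phi\left(-z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

where
- $\Phi(z_c) = P(Z < z_c)$
- $\Delta' = \mu_1 - \mu_2$ under $H_a$
- $\sigma$ is the standard error of $\bar{X} - \bar{Y}$

Example
Considering the previous comparison of battery brands, what is the power of detecting the difference when the true difference is at least 30 minutes (0.5 hours) and we use $\alpha = .01$?

Since $H_a: \mu_1 - \mu_2 \neq 0$, the two-sided formula with $\alpha = .01$ ($z_{\alpha/2} = 2.576$) gives us

$$\beta = \Phi\left(2.576 - \frac{0.5}{0.18}\right) - \Phi\left(-2.576 - \frac{0.5}{0.18}\right) = \Phi(-0.20) - \Phi(-5.35) \approx 0.4207$$

Thus the power is $1 - .4207 = .5793$, or 57.93%. Would the power increase or decrease if $\alpha = .05$?

NOTE: Formulas for sample size follow the same general rules. When $m = n$ the formula is given on page 366.

$\sigma$'s unknown: large samples
- With large samples, the Central Limit Thm gives $\bar{X} - \bar{Y} \approx$ Normal.
- The estimates $s_1^2$ and $s_2^2$ can be used for the variances.
- $(1-\alpha)100\%$ Confidence Interval of $\mu_1 - \mu_2$:

$$(\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

- Hypothesis test statistic:

$$z_s = \frac{(\bar{x} - \bar{y}) - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$$

$\sigma$'s unknown: using the t distribution
- Will use the t dist instead of the standard Normal.
- Must estimate the standard error.
- Two possible estimates:

1. The Unpooled Method (Variances Unequal)

$$\hat{\sigma}_{\bar{X}-\bar{Y}} = \sqrt{\frac{s_1^2}{m} + \frac{s_2^2}{n}}$$

2. The Pooled Method (Variances Equal)

$$s_p^2 = \frac{(m-1)s_1^2 + (n-1)s_2^2}{m+n-2}, \qquad \hat{\sigma}_{\bar{X}-\bar{Y}} = \sqrt{\frac{s_p^2}{m} + \frac{s_p^2}{n}}$$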
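Returning to the Type II error formulas for the battery comparison, the calculation can be sketched as follows. This is an illustrative sketch under stated assumptions, not part of the original slides: the helper names (`beta_upper`, `beta_lower`, `beta_two_sided`) are hypothetical, the battery standard deviations are taken as 1.20 and 1.35 hours, and the standard error is not rounded to 0.18, so the result differs slightly from a hand calculation with table values.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def beta_upper(delta_prime, sigma, alpha, delta0=0.0):
    """Type II error probability for Ha: mu1 - mu2 > delta0."""
    z_a = _STD_NORMAL.inv_cdf(1 - alpha)
    return _STD_NORMAL.cdf(z_a - (delta_prime - delta0) / sigma)


def beta_lower(delta_prime, sigma, alpha, delta0=0.0):
    """Type II error probability for Ha: mu1 - mu2 < delta0."""
    z_a = _STD_NORMAL.inv_cdf(1 - alpha)
    return 1 - _STD_NORMAL.cdf(-z_a - (delta_prime - delta0) / sigma)


def beta_two_sided(delta_prime, sigma, alpha, delta0=0.0):
    """Type II error probability for Ha: mu1 - mu2 != delta0."""
    z_a2 = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = (delta_prime - delta0) / sigma
    return _STD_NORMAL.cdf(z_a2 - shift) - _STD_NORMAL.cdf(-z_a2 - shift)


# Standard error of Xbar - Ybar for the battery example (m = n = 100).
sigma = sqrt(1.20**2 / 100 + 1.35**2 / 100)

beta_01 = beta_two_sided(delta_prime=0.5, sigma=sigma, alpha=0.01)
beta_05 = beta_two_sided(delta_prime=0.5, sigma=sigma, alpha=0.05)
print(f"alpha = .01: beta = {beta_01:.4f}, power = {1 - beta_01:.4f}")
print(f"alpha = .05: beta = {beta_05:.4f}, power = {1 - beta_05:.4f}")
```

Running both significance levels side by side shows the trade-off directly: a larger $\alpha$ shrinks the acceptance region, so $\beta$ falls and the power rises.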

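$\mu_1 - \mu_2$ ($\sigma$'s unknown): comparing the two estimates

Equal sample sizes:

               Sample 1          Sample 2
  Mean         $\bar{x} = 5$     $\bar{y} = 7$
  Std dev      $s_1 = 0.20$      $s_2 = 0.28$
  Size         $n_1 = 20$        $n_2 = 20$

  Unpooled: $\sqrt{.20^2/20 + .28^2/20} = .077$
  Pooled: $s^2_{pooled} = \dfrac{(20-1)(.20)^2 + (20-1)(.28)^2}{20+20-2} = .0592$, so $\sqrt{.0592(1/20 + 1/20)} = .077$

When $n_1 = n_2$, both methods give an identical answer.

Unequal sample sizes:

               Sample 1          Sample 2
  Mean         $\bar{y}_1 = 5$   $\bar{y}_2 = 7$
  Std dev      $s_1 = 0.20$      $s_2 = 0.28$
  Size         $n_1 = 10$        $n_2 = 20$

  Unpooled: $\sqrt{.20^2/10 + .28^2/20} = .089$
  Pooled: $s^2_{pooled} = \dfrac{(10-1)(.20)^2 + (20-1)(.28)^2}{10+20-2} = .066$, so $\sqrt{.066(1/10 + 1/20)} = .099$

When $n_1 \neq n_2$, the choice is based on the assumption $\sigma_1 = \sigma_2$.

Confidence interval for $\mu_1 - \mu_2$
- Recall: if $Y$ is normal but $\sigma$ is unknown, the CI is

$$\bar{y} \pm t_{\alpha/2}\, s/\sqrt{n}, \qquad df = n - 1$$

- If $X$, $Y$ are normal with $\sigma_1 = \sigma_2$, the CI is

$$(\bar{x} - \bar{y}) \pm t_{\alpha/2}\, s_p\sqrt{\frac{1}{m} + \frac{1}{n}}, \qquad df = m + n - 2$$

- If you do not assume $\sigma_1 = \sigma_2$, the CI is

$$(\bar{y}_1 - \bar{y}_2) \pm t_{\alpha/2}\sqrt{s_1^2/m + s_2^2/n}$$

  where

$$df = \frac{(SE_1^2 + SE_2^2)^2}{SE_1^4/(m-1) + SE_2^4/(n-1)}, \qquad SE_1 = s_1/\sqrt{m}, \quad SE_2 = s_2/\sqrt{n}$$

Example
We want to compare serum iron levels ($\mu$mol/l) for the population of healthy children and the population of children with cystic fibrosis. A sample of $m = 9$ healthy children resulted in $\bar{x} = 18.9$ and $s_1 = 5.9$, and a sample of $n = 13$ children with cystic fibrosis resulted in $\bar{y} = 11.9$ and $s_2 = 6.3$. What is the 90% CI for the difference in mean levels?

Will assume the variances are equal because $s_1$ and $s_2$ are not very different. The pooled variance is

$$s^2_{pooled} = \frac{(9-1)5.9^2 + (13-1)6.3^2}{9+13-2} = 37.74$$

The standard error is $\sqrt{37.74\left(\frac{1}{9} + \frac{1}{13}\right)} = 2.66$. For $df = 9 + 13 - 2 = 20$ degrees of freedom, the critical value $t_{.05,20}$ is 1.725. The confidence interval is

$$(18.9 - 11.9) \pm 1.725(2.66) = (2.4, 11.6)$$

Recall that since zero is not in this interval, the two means are statistically different at the $\alpha = .1$ significance level. Looking at the sample means, the children in the healthy population on average have higher serum iron levels than those with cystic fibrosis.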
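The pooled and unpooled standard errors, the unpooled degrees of freedom, and the serum iron interval can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and because the standard library has no t quantile function, the critical value 1.725 (20 df, 90% two-sided) is hard-coded from a t table rather than computed.

```python
from math import sqrt


def unpooled_se(s1, s2, m, n):
    """Standard error of xbar - ybar without assuming equal variances."""
    return sqrt(s1**2 / m + s2**2 / n)


def pooled_variance(s1, s2, m, n):
    """Pooled estimate of the common variance (assumes sigma1 = sigma2)."""
    return ((m - 1) * s1**2 + (n - 1) * s2**2) / (m + n - 2)


def pooled_se(s1, s2, m, n):
    """Standard error of xbar - ybar using the pooled variance."""
    return sqrt(pooled_variance(s1, s2, m, n) * (1 / m + 1 / n))


def unpooled_df(s1, s2, m, n):
    """Approximate degrees of freedom when variances are not pooled."""
    se1_sq, se2_sq = s1**2 / m, s2**2 / n
    return (se1_sq + se2_sq) ** 2 / (se1_sq**2 / (m - 1) + se2_sq**2 / (n - 1))


# Equal sample sizes: the two standard errors coincide.
print(unpooled_se(0.20, 0.28, 20, 20), pooled_se(0.20, 0.28, 20, 20))
# Unequal sample sizes: they differ.
print(unpooled_se(0.20, 0.28, 10, 20), pooled_se(0.20, 0.28, 10, 20))

# Serum iron example: 90% CI with pooled variance.
m, n = 9, 13
xbar, ybar, s1, s2 = 18.9, 11.9, 5.9, 6.3
T_CRIT_20DF_90 = 1.725  # t-table value, hard-coded (assumption: not computed here)
se = pooled_se(s1, s2, m, n)
lo = (xbar - ybar) - T_CRIT_20DF_90 * se
hi = (xbar - ybar) + T_CRIT_20DF_90 * se
print(f"pooled variance = {pooled_variance(s1, s2, m, n):.2f}, SE = {se:.2f}")
print(f"90% CI: ({lo:.1f}, {hi:.1f})")
```

Keeping the standard error and degrees-of-freedom pieces as separate functions makes it easy to swap between the pooled and unpooled methods for the same summary statistics.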

Hypothesis Testing
- Similar procedure as the one population case.
  $H_0: \mu_1 - \mu_2 = \Delta_0$
  $H_a: \mu_1 - \mu_2 > \Delta_0$, or $\mu_1 - \mu_2 < \Delta_0$, or $\mu_1 - \mu_2 \neq \Delta_0$

$$t_s = \frac{(\bar{x} - \bar{y}) - \Delta_0}{s_p\sqrt{\dfrac{1}{m} + \dfrac{1}{n}}}$$

- Compare $t_s$ to the t dist with $m + n - 2$ df.
- If no pooling, must alter the std error and df.

Assumptions
Both CI and hypothesis tests assume:
1. Samples are independent
   - Very important!!
   - Often accomplished through random sampling
   - Beware of subsampling/hierarchical sampling
2. Populations are Normally distributed
   - Less critical
   - Central Limit Thm allows for approximation
3. Variances are equal
   - Statistical tests are available to compare variances
   - Only necessary when using the pooled SE
   - If $n_1 = n_2$, works well if $1/2 < s_1/s_2 < 2$

Example
A new tapeworm medicine has been developed. To test its effectiveness, an experiment was conducted to compare the mean number of tapeworms in sheep treated with this medicine against the mean number in those that were not (control). A sample of 10 infected sheep were randomly divided into two groups. After four months, the sheep were slaughtered and the following counts were recorded. Is this drug effective at the .05 level?

  Treated      13  15  20  15  17
  Untreated    17  21  19  20  22

The sample statistics are $\bar{x} = 16$, $s_1^2 = 7$, $\bar{y} = 19.8$, and $s_2^2 = 3.7$.

In order to use the t distribution, we must assume that the two populations are Normal and that the observations within each sample and the two samples are independent. We may also decide to assume the variances are the same. With such a small data set, it is difficult to test for Normality, so we'll assume it is true. As for independence, we'll assume the sheep were kept in separate locations so the counts are independent. Since the sample variances don't differ by more than a factor of 2, we pool the variances.

$H_0: \mu_1 - \mu_2 = 0$ (Treated - Control)
$H_A: \mu_1 - \mu_2 < 0$

The pooled variance and test statistic are

$$s^2_{pooled} = \frac{7 + 3.7}{2} = 5.35, \qquad t_s = \frac{16 - 19.8}{\sqrt{5.35\left(\frac{1}{5} + \frac{1}{5}\right)}} = -2.60$$

Since the df = 8, the P-value is between .01 and .025 (1 sided). Therefore, we reject the null hypothesis. There is evidence that this drug reduces the number of tapeworms.
NOTE: If we didn't want to assume the variances are equal, the test statistic is the same but the degrees of freedom is now 7 (use the formula). The conclusions do not change.
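The tapeworm calculation, including the claim in the NOTE that the unpooled method gives the same statistic with about 7 degrees of freedom, can be reproduced with the sketch below. It is an illustration under stated assumptions: the counts are the ones consistent with the reported summary statistics (means 16 and 19.8, variances 7 and 3.7), the variable names are hypothetical, and the one-sided critical values 2.306 and 2.896 for 8 df are standard t-table entries that are hard-coded rather than computed.

```python
from math import sqrt
from statistics import mean, variance

# Counts consistent with the reported summary statistics (assumption).
treated = [13, 15, 20, 15, 17]
untreated = [17, 21, 19, 20, 22]

m, n = len(treated), len(untreated)
xbar, ybar = mean(treated), mean(untreated)
s1_sq, s2_sq = variance(treated), variance(untreated)  # sample variances

# Pooled method: common variance estimate, df = m + n - 2.
sp_sq = ((m - 1) * s1_sq + (n - 1) * s2_sq) / (m + n - 2)
t_pooled = (xbar - ybar) / sqrt(sp_sq * (1 / m + 1 / n))
df_pooled = m + n - 2

# Unpooled method: separate variances, approximate df from the formula.
se1_sq, se2_sq = s1_sq / m, s2_sq / n
t_unpooled = (xbar - ybar) / sqrt(se1_sq + se2_sq)
df_unpooled = (se1_sq + se2_sq) ** 2 / (
    se1_sq**2 / (m - 1) + se2_sq**2 / (n - 1)
)

# One-sided t-table critical values for 8 df (hard-coded table entries).
T_025_8DF, T_010_8DF = 2.306, 2.896
p_between_01_and_025 = T_025_8DF < abs(t_pooled) < T_010_8DF

print(f"pooled:   t = {t_pooled:.2f}, df = {df_pooled}")
print(f"unpooled: t = {t_unpooled:.2f}, df = {df_unpooled:.2f}")
print(f"one-sided P-value between .01 and .025: {p_between_01_and_025}")
```

The two statistics agree here because the sample sizes are equal, so only the degrees of freedom change between the methods.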