Chapter 22. Comparing Two Proportions. Copyright 2010 Pearson Education, Inc.

Comparing Two Proportions
Comparisons between two percentages are much more common than questions about isolated percentages. And they are more interesting. We often want to know how two groups differ, whether a treatment is better than a placebo control, or whether this year's results are better than last year's.

Another Ruler
In order to examine the difference between two proportions, we need another ruler: the standard deviation of the sampling distribution model for the difference between two proportions. Recall that standard deviations don't add, but variances do. In fact, the variance of the sum or difference of two independent random quantities is the sum of their individual variances.
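A quick arithmetic check of the "variances add" rule may help. The two standard deviations below (3 and 4) are made-up illustrative values, not numbers from the chapter, and the quantities are assumed independent.

```python
import math

# Hypothetical standard deviations of two independent random quantities.
sd_x = 3.0
sd_y = 4.0

# Variances add for a sum or a difference of independent quantities.
var_diff = sd_x**2 + sd_y**2

# The standard deviation is the square root of the summed variances.
sd_diff = math.sqrt(var_diff)  # 5.0, not 3 + 4 = 7

print(sd_diff)
```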

The Standard Deviation of the Difference Between Two Proportions
Proportions observed in independent random samples are independent. Thus, we can add their variances. So the standard deviation of the difference between two sample proportions is
$$SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$$
where $q = 1 - p$ for each group. Thus, the standard error is
$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$
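The standard error formula can be sketched in a few lines of Python. The function name `se_diff` and the counts (60 of 100 versus 45 of 100) are hypothetical, chosen only for illustration.

```python
import math

def se_diff(successes1, n1, successes2, n2):
    """Standard error of p1_hat - p2_hat for two independent samples."""
    p1_hat = successes1 / n1
    p2_hat = successes2 / n2
    # Each term is the estimated variance of one sample proportion.
    var1 = p1_hat * (1 - p1_hat) / n1
    var2 = p2_hat * (1 - p2_hat) / n2
    # Add the variances first, then take the square root.
    return math.sqrt(var1 + var2)

# Hypothetical data: 60 successes out of 100 versus 45 out of 100.
se = se_diff(60, 100, 45, 100)
print(round(se, 4))
```

Note that the standard error is smaller than the sum of the two individual standard errors, because the variances add rather than the standard deviations.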

Assumptions and Conditions
Independence Assumptions:
Randomization Condition: The data in each group should be drawn independently and at random from a homogeneous population or generated by a randomized comparative experiment.
The 10% Condition: If the data are sampled without replacement, the sample should not exceed 10% of the population.
Independent Groups Assumption: The two groups we're comparing must be independent of each other.

Assumptions and Conditions (cont.)
Sample Size Condition: Each of the groups must be big enough.
Success/Failure Condition: Both groups are big enough that at least 10 successes and at least 10 failures have been observed in each.
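The two conditions that involve counts can be checked mechanically; randomization and independence of the groups cannot be verified from the numbers alone and still need to be judged from how the data were collected. The sketch below is one way to do the count checks. The function name `check_conditions` and the example numbers are hypothetical.

```python
def check_conditions(successes, n, population_size=None):
    """Check the count-based conditions for one group.

    Returns a dict of condition name -> True/False. The 10% Condition is
    only checked when a population size is supplied.
    """
    failures = n - successes
    result = {
        # Success/Failure Condition: at least 10 of each outcome observed.
        "success_failure": successes >= 10 and failures >= 10,
    }
    if population_size is not None:
        # 10% Condition: the sample should not exceed 10% of the population.
        result["ten_percent"] = n <= 0.10 * population_size
    return result

# Hypothetical groups: 60 of 100 sampled from 5000, and 5 of 100.
group1 = check_conditions(60, 100, population_size=5000)
group2 = check_conditions(5, 100)
print(group1, group2)
```

Run the check once per group; both groups need to pass before the Normal model is used.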

The Sampling Distribution
We already know that for large enough samples, each of our proportions has an approximately Normal sampling distribution. The same is true of their difference.

The Sampling Distribution (cont.)
Provided that the sampled values are independent, the samples are independent, and the sample sizes are large enough, the sampling distribution of $\hat{p}_1 - \hat{p}_2$ is modeled by a Normal model with
Mean: $\mu = p_1 - p_2$
Standard deviation: $SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}$
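A small simulation can make the model concrete. This is only a sketch: the true proportions (0.60 and 0.45), the sample sizes (100 each), the number of repetitions, and the seed are all assumed values picked for illustration.

```python
import math
import random
import statistics

def simulate_differences(p1, n1, p2, n2, reps, seed=0):
    """Draw `reps` pairs of independent samples and return p1_hat - p2_hat for each."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        # Count successes in each independent sample.
        x1 = sum(rng.random() < p1 for _ in range(n1))
        x2 = sum(rng.random() < p2 for _ in range(n2))
        diffs.append(x1 / n1 - x2 / n2)
    return diffs

# Assumed true proportions and sample sizes (illustrative only).
p1, n1, p2, n2 = 0.60, 100, 0.45, 100
diffs = simulate_differences(p1, n1, p2, n2, reps=4000)

# Mean and standard deviation predicted by the Normal model.
model_mean = p1 - p2
model_sd = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Mean and standard deviation actually seen in the simulation.
sim_mean = statistics.mean(diffs)
sim_sd = statistics.stdev(diffs)
print(model_mean, sim_mean)
print(model_sd, sim_sd)
```

The simulated mean and standard deviation should land close to the model's values, with small differences due to simulation noise.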

Two-Proportion z-interval
When the conditions are met, we are ready to find the confidence interval for the difference of two proportions. The confidence interval is
$$(\hat{p}_1 - \hat{p}_2) \pm z^* \times SE(\hat{p}_1 - \hat{p}_2)$$
where
$$SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$$
The critical value $z^*$ depends on the particular confidence level, C, that you specify.
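The interval can be computed with the Python standard library alone. In this sketch the function name `two_prop_z_interval` and the counts are hypothetical; `statistics.NormalDist().inv_cdf` supplies the critical value $z^*$ for the chosen confidence level.

```python
import math
from statistics import NormalDist

def two_prop_z_interval(successes1, n1, successes2, n2, confidence=0.95):
    """Confidence interval for p1 - p2 (unpooled standard error)."""
    p1_hat = successes1 / n1
    p2_hat = successes2 / n2
    # Unpooled standard error: each group keeps its own proportion.
    se = math.sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    # Critical value z* leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Hypothetical data: 60 of 100 versus 45 of 100, 95% confidence.
low, high = two_prop_z_interval(60, 100, 45, 100)
print(round(low, 3), round(high, 3))
```

An interval that excludes 0 suggests the two proportions differ; the conditions still have to be checked before the interval is trusted.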

Everyone into the Pool
The typical hypothesis test for the difference in two proportions is the one of no difference. In symbols, $H_0: p_1 - p_2 = 0$. Since we are hypothesizing that there is no difference between the two proportions, that means that the standard deviations for each proportion are the same. Since this is the case, we combine (pool) the counts to get one overall proportion.

Everyone into the Pool (cont.)
The pooled proportion is
$$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$$
where $Success_1 = n_1 \hat{p}_1$ and $Success_2 = n_2 \hat{p}_2$.
If the numbers of successes are not whole numbers, round them first. (This is the only time you should round values in the middle of a calculation.)
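When only percentages are reported, the success counts have to be recovered and rounded before pooling. The sketch below shows one way to do that; the function names, sample sizes, and proportions are hypothetical values for illustration.

```python
def successes_from_proportion(n, p_hat):
    """Recover a whole-number success count from a reported proportion."""
    # Round to the nearest whole count before pooling.
    return round(n * p_hat)

def pooled_proportion(successes1, n1, successes2, n2):
    """Overall proportion of successes across both groups combined."""
    return (successes1 + successes2) / (n1 + n2)

# Hypothetical reported results: 31.7% of 293 and 25.6% of 469.
s1 = successes_from_proportion(293, 0.317)
s2 = successes_from_proportion(469, 0.256)
p_pooled = pooled_proportion(s1, 293, s2, 469)
print(s1, s2, round(p_pooled, 4))
```

The pooled value always falls between the two sample proportions, closer to the one from the larger group.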

Everyone into the Pool (cont.)
We then put this pooled value into the formula, substituting it for both sample proportions in the standard error formula:
$$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}}$$
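The pooled standard error differs from the unpooled one only in which proportion goes into each variance term. A minimal sketch, with a hypothetical function name and hypothetical counts:

```python
import math

def pooled_se(successes1, n1, successes2, n2):
    """Return (pooled proportion, pooled standard error of p1_hat - p2_hat)."""
    p_pooled = (successes1 + successes2) / (n1 + n2)
    q_pooled = 1 - p_pooled
    # The same pooled proportion is used in both variance terms.
    se = math.sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)
    return p_pooled, se

# Hypothetical data: 60 of 100 versus 45 of 100.
p_pooled, se_pooled = pooled_se(60, 100, 45, 100)
print(p_pooled, round(se_pooled, 4))
```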

Compared to What?
We'll reject our null hypothesis if we see a large enough difference in the two proportions. How can we decide whether the difference we see is large? Just compare it with its standard deviation. Unlike previous hypothesis testing situations, the null hypothesis doesn't provide a standard deviation, so we'll use a standard error (here, pooled).

Two-Proportion z-test
The conditions for the two-proportion z-test are the same as for the two-proportion z-interval. We are testing the hypothesis $H_0: p_1 - p_2 = 0$, or, equivalently, $H_0: p_1 = p_2$. Because we hypothesize that the proportions are equal, we pool them to find
$$\hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2}$$

Two-Proportion z-test (cont.)
We use the pooled value to estimate the standard error:
$$SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}}$$
Now we find the test statistic:
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)}$$
When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value.
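Putting the pieces together, the test can be sketched as follows. The function name and counts are hypothetical, and the P-value is computed for a two-sided alternative, which is an assumption; the slides do not fix the direction of the alternative.

```python
import math
from statistics import NormalDist

def two_prop_z_test(successes1, n1, successes2, n2):
    """Two-proportion z-test of H0: p1 - p2 = 0 (two-sided P-value assumed)."""
    p1_hat = successes1 / n1
    p2_hat = successes2 / n2
    # Pool the counts because H0 says the two proportions are equal.
    p_pooled = (successes1 + successes2) / (n1 + n2)
    q_pooled = 1 - p_pooled
    se_pooled = math.sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)
    # Observed difference minus the hypothesized difference (0), in SE units.
    z = ((p1_hat - p2_hat) - 0) / se_pooled
    # Two-sided P-value from the standard Normal model.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical data: 60 of 100 versus 45 of 100.
z, p_value = two_prop_z_test(60, 100, 45, 100)
print(round(z, 2), round(p_value, 3))
```

For a one-sided alternative, use a single tail of the Normal model instead of doubling it.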

What Can Go Wrong?
Don't use two-sample proportion methods when the samples aren't independent. These methods give wrong answers when the independence assumption is violated.
Don't apply inference methods when there was no randomization. Our data must come from representative random samples or from a properly randomized experiment.
Don't interpret a significant difference in proportions causally. Be careful not to jump to conclusions about causality.

What have we learned?
We've now looked at inference for the difference in two proportions. Perhaps the most important thing to remember is that the concepts and interpretations are essentially the same; only the mechanics have changed slightly.

What have we learned? (cont.)
Hypothesis tests and confidence intervals for the difference in two proportions are based on Normal models. Both require us to find the standard error of the difference in two proportions. We do that by adding the variances of the two sample proportions, assuming our two groups are independent. When we test a hypothesis that the two proportions are equal, we pool the sample data; for confidence intervals we don't pool.