A quick activity - Central Limit Theorem and Proportions. Lecture 21: Testing Proportions. Results from the GSS. Statistics and the General Population

Lecture 21: Testing Proportions
Statistics 10
Colin Rundel
April 10, 2012

A quick activity - Central Limit Theorem and Proportions

Flip a coin 30 times (this is going to get loud!). Record the number of heads you obtained and calculate the proportion of heads in the 30 flips. When you are done, come up and mark the sample proportion on the dot plot.

Using my mystical statistical powers I predict that the distribution of sample proportions should be nearly normally distributed, with mean equal to approximately 0.5 and a standard deviation of approximately 0.09.

Single population proportion

Statistics and the General Population

Two scientists want to know if a certain drug is effective against high blood pressure. The first scientist wants to give the drug to 1000 people with high blood pressure and see how many of them experience lower blood pressure levels. The second scientist wants to give the drug to 500 people with high blood pressure, and not give the drug to another 500 people with high blood pressure, and see how many in both groups experience lower blood pressure levels. Which is the better way to test this drug?

(a) All 1000 get the drug
(b) 500 get the drug, 500 don't

Results from the GSS

The General Social Survey asks the same question. Below is the distribution of responses from the 2010 survey:

    All 1000 get the drug            99
    500 get the drug, 500 don't     571
    Total                           670
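The prediction in the coin activity comes from the standard error formula $\sqrt{p(1-p)/n}$ with $p = 0.5$ and $n = 30$. The following is a minimal sketch that checks it by simulation. The class size of 1000 simulated students, the seed, and the function names are assumptions made for illustration and are not part of the lecture.

```python
import math
import random
import statistics


def predicted_se(p, n):
    """Standard error of a sample proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)


def simulate_sample_proportions(n_flips=30, n_students=1000, p=0.5, seed=0):
    """Return one sample proportion of heads per simulated student.

    n_students and seed are illustrative assumptions, not lecture values.
    """
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for _ in range(n_flips)) / n_flips
        for _ in range(n_students)
    ]


props = simulate_sample_proportions()

# Theoretical standard deviation of the sample proportions (about 0.09).
print(round(predicted_se(0.5, 30), 4))

# Simulated center and spread, which should land close to 0.5 and 0.09.
print(round(statistics.mean(props), 3), round(statistics.stdev(props), 3))
```

With a real class the dot plot has far fewer points, so it will look rougher than the simulated distribution.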

Parameter and Point Estimates

We would like to estimate the proportion of all Americans who have a good intuition about experimental design, i.e. would answer "500 get the drug, 500 don't". What are the parameter of interest and the point estimate?

Parameter of interest: Proportion of all Americans who have a good intuition about experimental design.
    $p$ (a population proportion)

Point estimate: Proportion of sampled Americans who have a good intuition about experimental design.
    $\hat{p}$ (a sample proportion)

Inference on a Proportion

What percent of all Americans have a good intuition about experimental design, i.e. would answer "500 get the drug, 500 don't"?

We can answer this research question using a confidence interval, which we know is of the form

    point estimate ± ME

And we also know from confidence intervals for means that ME = critical value × standard error of the point estimate.

Standard error of a sample proportion:

    $SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$

Identifying when a sample proportion is nearly normal

Sample proportions are also nearly normally distributed. Then, according to the CLT:

    mean $= p$, $SE = \sqrt{\frac{p(1-p)}{n}}$

But of course this is true only under certain conditions...

    Independence
        Randomization
        10% Condition
    Nearly Normal
        Number of successes ($np$) ≥ 10
        Number of failures ($n(1-p)$) ≥ 10

If $p$ is unknown (most cases), we use $\hat{p}$ in the calculation of the standard error.

Confidence intervals for a proportion

Back to experimental design...

The GSS found that 571 out of 670 (85%) of Americans answered the question on experimental design correctly. Estimate (using a 95% confidence interval) the proportion of all Americans who have a good intuition about experimental design.

Given: $n = 670$, $\hat{p} = 0.85$. First check assumptions & conditions.

1. Independence: The sample is random, and 670 < 10% of all Americans, therefore we can assume that one respondent's response is independent of another.
2. Normality: 571 people answered correctly (successes) and 99 answered incorrectly (failures), both are greater than 10.
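The condition check and the interval can be put together in a few lines. This is a rough sketch only: the function name `proportion_ci` is hypothetical, it uses the unrounded $\hat{p} = 571/670$ and the exact normal quantile rather than the rounded 0.85 and 1.96, and it can only check the success-failure counts. Randomization and the 10% condition depend on how the sample was collected and have to be judged separately.

```python
import math
from statistics import NormalDist


def proportion_ci(successes, n, conf_level=0.95):
    """Normal-approximation confidence interval for a single proportion.

    Raises ValueError when the success-failure condition (at least 10
    successes and 10 failures) is not met. Independence cannot be checked
    from the counts alone.
    """
    failures = n - successes
    if successes < 10 or failures < 10:
        raise ValueError("need at least 10 successes and 10 failures")

    p_hat = successes / n
    # Critical value z* for the requested confidence level (about 1.96 for 95%).
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # p is unknown, so p_hat stands in for p in the standard error.
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z_star * se
    return p_hat - margin, p_hat + margin


low, high = proportion_ci(571, 670)
print(f"95% CI: {low:.3f} to {high:.3f}")
```

Because the inputs are not rounded, the endpoints may differ in the third decimal place from a hand calculation with 0.85 and 1.96.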

Confidence intervals for a proportion

We are given that $n = 670$, $\hat{p} = 0.85$. We also just learned that the standard error of the sample proportion is $SE = \sqrt{\frac{p(1-p)}{n}}$. Which of the below is the correct calculation of the 95% confidence interval?

(a) $0.85 \pm 1.96 \times \sqrt{\frac{0.85 \times 0.15}{670}}$
(b) $0.85 \pm 1.65 \times \sqrt{\frac{0.85 \times 0.15}{670}}$
(c) $0.85 \pm 1.96 \times \frac{0.85 \times 0.15}{\sqrt{670}}$
(d) $571 \pm 1.96 \times \sqrt{\frac{571 \times 99}{670}}$

Choosing a sample size when estimating a proportion

Choosing a sample size

If the researchers were going to conduct another study on the same survey question, how many people should they sample in order to cut the margin of error of a 95% confidence interval down to 1%?

    $CI = \hat{p} \pm ME$

    $ME = z^\star \times SE = z^\star \sqrt{\frac{p(1-p)}{n}} \approx z^\star \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$

    $0.01 \ge 1.96 \times \sqrt{\frac{0.85 \times 0.15}{n}}$    (use the estimate for $\hat{p}$ from the previous study)

    $0.01^2 \ge 1.96^2 \times \frac{0.85 \times 0.15}{n}$

    $n \ge \frac{1.96^2 \times 0.85 \times 0.15}{0.01^2}$

    $n \ge 4898.04$, so $n$ should be at least 4,899

What if there isn't a previous study?

What should the researchers do if they are planning to ask a new survey question where they have no idea what the population proportion might be?

... use $p = 0.5$

Why?

    if you don't know any better, 50-50 is a good guess
    $p = 0.5$ gives the most conservative estimate: $p(1-p)$ is largest when $p = 0.5$, which results in the largest possible $n$

[Figure: $p(1-p)$ plotted against $p$, for $p$ from 0.0 to 1.0]

Choosing a sample size, cont.

Previously we saw that if the researchers want a margin of error less than 1%, they will need to sample at least 4,899 people when they expect the population proportion to be near 85%. How does this change when we expect the population proportion to be near 50%?

    $0.01 \ge 1.96 \times \sqrt{\frac{0.5 \times (1 - 0.5)}{n}}$

    $0.01^2 \ge 1.96^2 \times \frac{0.5 \times 0.5}{n}$

    $n \ge \frac{1.96^2 \times 0.5 \times 0.5}{0.01^2}$

    $n \ge 9604$

which is almost double the sample size!
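Both sample size calculations follow the same rearrangement, $n \ge z^{\star 2}\, p(1-p) / ME^2$, so they can be sketched as one small helper. The function name and its defaults are assumptions for illustration. The small rounding step before `math.ceil` is only a guard against floating-point noise pushing an exact integer answer up by one.

```python
import math


def required_sample_size(margin, p_guess=0.5, z_star=1.96):
    """Smallest n whose margin of error is at most `margin`.

    p_guess is the anticipated proportion. The default of 0.5 is the
    conservative choice because p * (1 - p) is largest there.
    """
    n_exact = z_star ** 2 * p_guess * (1 - p_guess) / margin ** 2
    # Round away floating-point noise, then always round up to a whole person.
    return math.ceil(round(n_exact, 6))


print(required_sample_size(0.01, p_guess=0.85))  # 4899
print(required_sample_size(0.01, p_guess=0.5))   # 9604
```

Always rounding up, never to the nearest integer, is what guarantees the margin of error does not exceed the target.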

Back to the Coins

Now that we know how to calculate the confidence interval for a sample proportion,

    $CI = \hat{p} \pm z^\star \sqrt{\frac{p(1-p)}{n}}$    (with $\hat{p}$ in place of $p$ when $p$ is unknown)

calculate a 90% confidence interval based on the proportion of heads you observed in the 30 flips.

Does this interval include 50%? What does your result tell you about the fairness of your coin?

Example - Legalizing Marijuana

The 2010 General Social Survey also asked the question: "Do you think the use of marijuana should be made legal, or not?" 48% of the 1,259 respondents said it should be made legal.

a) Is the number 48% a sample statistic or a population parameter? Explain.
b) Construct a 95% confidence interval for the proportion of Americans who think marijuana should be made legal.
c) Interpret this confidence interval in the context of this question.
d) A critic points out that this 95% confidence interval is only accurate if the statistic follows a normal distribution, or if the normal model is a good approximation. Is this true for these data? Explain.
e) A news piece on this study's findings states: "Majority of Americans think marijuana should be legalized." Based on your confidence interval, is this news piece's statement justified?

Hypothesis testing for a proportion

Hypotheses

Which of the following are the correct set of hypotheses for testing if more than 80% of Americans have a good intuition about experimental design?

(a) $H_0: \mu = 0.80$, $H_A: \mu > 0.80$
(b) $H_0: p = 0.85$, $H_A: p > 0.85$
(c) $H_0: p = 0.80$, $H_A: p > 0.80$
(d) $H_0: \hat{p} = 0.80$, $H_A: \hat{p} > 0.80$

Hypothesis testing for a proportion

    mean $= 0.80$, $SE = \sqrt{\frac{0.80 \times 0.20}{670}} = 0.0154$

Note: The SE is different, because now we are conducting a hypothesis test assuming $H_0$ is true, and $H_0$ says $p = 0.80$.

[Figure: sampling distribution of sample proportions, with 0.8 and 0.85 marked]

    $Z = \frac{0.85 - 0.80}{0.0154} = 3.25$

    p-value $= P(Z > 3.25) = 1 - 0.9994 = 0.0006$

Since the p-value is less than 0.05, we reject $H_0$. The data provide convincing evidence that more than 80% of Americans have a good intuition on experimental design.
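The test above can be sketched in code as follows. The function name is hypothetical and the normal tail area comes from Python's standard library rather than a Z table. The key point it illustrates is that the standard error is built from the null value $p = 0.80$, not from $\hat{p}$.

```python
import math
from statistics import NormalDist


def upper_tail_proportion_test(p_hat, n, p_null):
    """One-sided z test of H0: p = p_null against HA: p > p_null.

    Returns (standard error under H0, z statistic, p-value).
    """
    # Under H0 the sampling distribution is centered at p_null,
    # so p_null (not p_hat) goes into the standard error.
    se = math.sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / se
    p_value = 1 - NormalDist().cdf(z)  # area to the right of z
    return se, z, p_value


se, z, p_value = upper_tail_proportion_test(0.85, 670, 0.80)
print(f"SE = {se:.4f}, Z = {z:.2f}, p-value = {p_value:.4f}")
```

The Z statistic printed here can differ slightly from the slide's 3.25, since the slide divides by the SE after rounding it to 0.0154. The conclusion is the same either way.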

Example - Legalizing Marijuana

We can also employ a hypothesis test to examine whether a majority of Americans support legalizing marijuana.

    $H_0: p = 0.50$
    $H_A: p > 0.50$

    mean $= 0.5$, $SE = \sqrt{\frac{0.50 \times 0.50}{1259}} = 0.014$

    $Z = \frac{0.48 - 0.5}{0.014} = -1.43$

    p-value $= P(Z > -1.43) = 1 - 0.0764 = 0.9236$

Therefore, we fail to reject the null hypothesis, since the p-value > 0.05. There is no evidence that a majority of Americans support legalizing marijuana.
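As a cross-check, the same numbers can be run through both the test and a 95% confidence interval. This is a sketch under the assumption that $\hat{p} = 0.48$ is used as reported (the raw count of "legal" responses is not given) and the variable names are illustrative. Note the two different standard errors: the test uses the null value 0.50, while the interval uses $\hat{p}$.

```python
import math
from statistics import NormalDist

p_hat, n, p_null = 0.48, 1259, 0.50
std_normal = NormalDist()

# Hypothesis test: the standard error is computed under H0 (p = 0.50).
se_null = math.sqrt(p_null * (1 - p_null) / n)
z = (p_hat - p_null) / se_null
p_value = 1 - std_normal.cdf(z)  # upper tail, because HA is p > 0.50

# Confidence interval: the standard error is computed from p_hat.
se_hat = math.sqrt(p_hat * (1 - p_hat) / n)
z_star = std_normal.inv_cdf(0.975)
ci = (p_hat - z_star * se_hat, p_hat + z_star * se_hat)

print(f"Z = {z:.2f}, p-value = {p_value:.3f}")
print(f"95% CI: {ci[0]:.3f} to {ci[1]:.3f}")
```

Because the sample proportion falls below 0.50, Z is negative and the upper-tail p-value is well above 0.5. The exact values differ a little from the hand calculation, which rounds the SE to 0.014 first.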