Data Analysis and Statistical Methods
Statistics 651
http://www.stat.tamu.edu/~suhasini/teaching.html
Suhasini Subba Rao

Review of testing: Example

The administrator of a nursing home wants to do a time and motion study of staff time spent per day performing non-emergency work. Prior to the introduction of some efficiency measures, the mean person hours per day spent on these tasks was 8. The administrator wants to test whether these measures have reduced the average time. She collects a random sample of 30 staff and monitors their time. She finds that the average person hours spent (based on this sample, $n = 30$) is 7. Test the hypothesis that the mean time spent on performing non-emergency work has decreased after the introduction of the efficiency measures. Assume the standard deviation is $\sigma = 4$ and do the test at the $\alpha = 10\%$ level. Construct a 95% CI for the daily mean time spent performing non-emergency tasks.

Solution: Do it.

Review of previous lecture: calculating power

Two-sided tests: $H_0: \mu = \mu_0$ against $H_A: \mu \neq \mu_0$. We do the test at the $\alpha$ level (usually $\alpha = 5\%$) and we are interested in the alternative $\mu_1$. Then

$$\beta \approx P\left(Z \le z_{\alpha/2} - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}}\right),$$

and the power is $1 - \beta$. In the case that $\alpha = 5\%$, $z_{\alpha/2} = 1.96$.

One-sided tests: $H_0: \mu \le \mu_0$ against $H_A: \mu > \mu_0$. We do the test at the $\alpha$ level and we are interested in the alternative $\mu_1$. Then

$$\beta = P\left(Z \le z_{\alpha} - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}}\right).$$

The power is $1 - \beta$ (in the case that $\alpha = 5\%$, $z_{\alpha} = 1.64$). The closer the power is to 100% the better.
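The two power formulas in the review can be evaluated directly. The following is a minimal sketch, assuming a z-test with known $\sigma$; the function names `type_ii_error` and `power` are illustrative, and the alternative $\mu_1 = 7$ in the example call is an assumption made here (the slides do not fix an alternative for the nursing-home example).

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def type_ii_error(mu0, mu1, sigma, n, alpha=0.05, two_sided=False):
    """Type II error (beta) of a z-test with known sigma at the alternative mu1."""
    tail = alpha / 2 if two_sided else alpha
    z_crit = STD_NORMAL.inv_cdf(1 - tail)        # z_{alpha/2} or z_alpha
    shift = abs(mu0 - mu1) / (sigma / sqrt(n))   # |mu0 - mu1| / (sigma / sqrt(n))
    return STD_NORMAL.cdf(z_crit - shift)


def power(mu0, mu1, sigma, n, alpha=0.05, two_sided=False):
    """Power = 1 - beta."""
    return 1 - type_ii_error(mu0, mu1, sigma, n, alpha, two_sided)


# Nursing-home numbers (mu0 = 8, sigma = 4, n = 30, alpha = 10%).
# mu1 = 7 is an assumed alternative, chosen only for illustration.
one_sided_power = power(mu0=8, mu1=7, sigma=4, n=30, alpha=0.10)
two_sided_power = power(mu0=8, mu1=7, sigma=4, n=30, alpha=0.10, two_sided=True)
print(f"one-sided power: {one_sided_power:.3f}")
print(f"two-sided power (approximate): {two_sided_power:.3f}")
```

At the same $\alpha$, the one-sided test has more power against this alternative than the two-sided test, because its critical value $z_\alpha$ is smaller than $z_{\alpha/2}$.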
Tests and very large sample sizes

If we are testing the hypothesis $H_0: \mu = 2$ against $H_A: \mu \neq 2$ and the sample size is extremely large (say 3000 observations), then we are likely to reject the null unless the mean is exactly $\mu = 2$. This is because for very large sample sizes a test can detect even slight deviations from the null mean. This can be a good or a bad thing. If the deviation from the null is very small, we might not want to reject the null. In such a case, constructing a confidence interval may give us a more accurate idea about the nature of the population mean.

Illustration: large sample sizes and the HIV vaccine

Take a look at hiv vaccine.tex. In this study there were 16,000 volunteers. Among the 8,000 who received the vaccine, 51 became infected with the virus. Among the 8,000 who had a placebo, 74 became infected with the virus. The difference between the proportions is about 0.003 (i.e. the difference between the proportion of those who did not take the vaccine and developed HIV and the proportion of those who took the vaccine and developed HIV is $74/8000 - 51/8000 = 0.002875$). This is very small. However, the standard error for this example is 0.00139 (we show why this is the case later in the course). The standard error is quite a lot smaller than the difference (the ratio is $0.002875/0.00139 = 2.06$). This suggests that the difference is significant, though small. If you calculate the one-sided p-value from this ratio you should get about 0.02 (2%), which is less than 5% (the rejection level). Hence, based on these numbers, the vaccine seems to have a small protective effect.

Example 1

A city officer audits the parking tickets issued by city parking officers. In past years the mean number of improperly issued tickets has been 380, with standard deviation $\sigma = 35.2$. Because there has been a change in the city regulations, the city manager suspects that the mean number of improperly issued tickets has increased. Suppose she does an audit on 50 randomly selected officers and she considers a mean number of tickets greater than 390 to be unacceptably large.
Evaluate the power of the test. Use $\alpha = 0.05$. See solutions lecture17.pdf, pages 1-2, for the solution.
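Both calculations on these slides can be checked numerically. The sketch below first recomputes the HIV vaccine figures and then sets up the power calculation for Example 1 with the one-sided formula from the review. Two things here are assumptions rather than statements from the slides: the standard error is computed with the usual unpooled two-proportion formula (which reproduces the quoted 0.00139), and the p-value is taken to be one-sided.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal

# --- HIV vaccine illustration ---
n_group = 8000
p_placebo = 74 / n_group
p_vaccine = 51 / n_group
diff = p_placebo - p_vaccine
# Assumed formula: unpooled standard error of a difference in proportions.
se = sqrt(p_placebo * (1 - p_placebo) / n_group
          + p_vaccine * (1 - p_vaccine) / n_group)
ratio = diff / se
p_one_sided = 1 - Z.cdf(ratio)   # assumed one-sided alternative
print(f"difference = {diff:.6f}, se = {se:.5f}, ratio = {ratio:.2f}")
print(f"one-sided p-value = {p_one_sided:.4f}")

# --- Example 1: power at mu1 = 390 (one-sided z-test, known sigma) ---
mu0, mu1, sigma, n, alpha = 380, 390, 35.2, 50, 0.05
z_alpha = Z.inv_cdf(1 - alpha)
beta = Z.cdf(z_alpha - abs(mu0 - mu1) / (sigma / sqrt(n)))
power_example1 = 1 - beta
print(f"Example 1: beta = {beta:.3f}, power = {power_example1:.3f}")
```

The script uses the exact normal quantile for $z_\alpha$ rather than the rounded table value 1.64, so its output can differ from a hand calculation in the third decimal place.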
Example 2

Prospective salespeople are now being offered a sales training program. Previous data indicate that without training the average salesperson sold 33 items. The company wants to assess the impact the sales training program has on sales. One month after training started, 35 people had had training and it was found that on average 35 books were being sold; the standard deviation is 8.4. Has the training enhanced sales? Use $\alpha = 0.05$. The company decides that the training is only cost effective if on average 38 or more books are sold. Evaluate the power for the test you used above. See solutions lecture17.pdf, pages 3-5, for the solution.

Solution 2

(a) We want to see whether mean sales have improved after some training, so our null is $H_0: \mu \le 33$ and the alternative is $H_A: \mu > 33$. We calculate the p-value, supposing that $\bar{X} \sim N(33, 8.4^2/35)$. We calculate the probability of observing an average of 35 or more given that the true mean is 33:

$$P(\bar{X} \ge 35) = P\left(\frac{\bar{X} - 33}{8.4/\sqrt{35}} \ge \frac{2}{8.4/\sqrt{35}}\right) = P(Z \ge 1.408) = 0.079.$$

Since we are doing a one-sided test with $\alpha = 0.05$ and $0.079 > 0.05$, there is not enough evidence to reject the null, hence not enough evidence to say training has enhanced sales.

(b) We are really interested in seeing whether our statistical test is able to reject the null when the true mean is 38, as we want to be able to detect the change in sales at this point. Using the formula, the type II error is

$$\beta = P\left(Z \le z_{\alpha} - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}}\right) = P\left(Z \le 1.64 - \frac{|33 - 38|}{\sqrt{8.4^2/35}}\right) = P(Z \le -1.88) = 0.0301.$$

Hence the power is $100(1 - 0.0301) = 97\%$, which is very large. This means that with a large probability we are able to reject the null when the mean is $\mu = 38$. Hence if we want to detect when the mean is 38, we have a large chance of detecting a change.

Sample size and power

We learnt in the previous lecture that there are three major components that exert an influence on power. These are:

- How close the alternative is to the null. The further the alternative $\mu_1$ is from $\mu_0$, the larger the power (which is good!).
- The variance of the population.
- The sample size (something that we can control). The larger the sample size, the better (larger) the power.

Usually, we have an alternative in mind (recall the examples above), and we also have a power value (say 80 or 90 percent) that we want to aim for. Therefore, before conducting an experiment we want to determine the sample size needed to attain a certain amount of power.
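The numbers in Solution 2, and the effect of the sample size on power, can be reproduced with a few lines. This is a sketch under the same assumptions as the solution (one-sided z-test, $\sigma = 8.4$ treated as known); the helper names and the list of sample sizes are illustrative choices, not part of the slides.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal
mu0, sigma, alpha = 33, 8.4, 0.05
z_alpha = Z.inv_cdf(1 - alpha)  # exact quantile, about 1.645 (tables give 1.64)


def p_value_upper(xbar, n):
    """P(sample mean >= xbar) when the true mean is mu0."""
    return 1 - Z.cdf((xbar - mu0) / (sigma / sqrt(n)))


def power_at(mu1, n):
    """Power of the one-sided test against the alternative mu1."""
    return 1 - Z.cdf(z_alpha - abs(mu0 - mu1) / (sigma / sqrt(n)))


p_val = p_value_upper(xbar=35, n=35)   # part (a)
power_38 = power_at(mu1=38, n=35)      # part (b)
print(f"(a) p-value = {p_val:.3f}")
print(f"(b) power at mu = 38: {power_38:.3f}")

# Power against the closer alternative mu1 = 34 for several sample sizes.
powers_34 = {n: power_at(34, n) for n in (35, 100, 300, 600)}
for n, pw in powers_34.items():
    print(f"n = {n:4d}: power at mu = 34 is {pw:.3f}")
```

With $n = 35$ the test has high power against $\mu = 38$ but low power against $\mu = 34$; the power against the closer alternative only becomes large once $n$ is in the hundreds.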
Choosing the sample size for the alternative $\mu_1$

Calculating what the sample size should be to get a certain power level is rather like calculating the size of $n$ so that a confidence interval has a certain length.

Suppose we do the test $H_0: \mu = \mu_0$ against $H_A: \mu = \mu_1$ at the $\alpha$ level (say 5%). We want to be able to reject the null when $\mu = \mu_1$ with probability $100(1 - \beta)\%$ (say 90%). Hence we want the power of the test to be $100(1 - \beta)\%$ (for example 90%). We want to find the sample size $n$ which will give us the power $100(1 - \beta)\%$.

Recipe for choosing $n$:

We set $\beta = 1 - \text{power}/100$ (for example $\beta = 0.10$). $\beta$ is the type II error and the number we will be using in the calculation.

Then we look inside the Normal tables for the probability $\beta$ (for example, look inside the table for 0.1) and read out to find $z_\beta$. So $z_\beta$ will be the value on the x-axis where the area to the right is $\beta$ (for example 10%), that is, $P(Z \ge z_\beta) = \beta$.

For the two-sided test choose the $n$ which satisfies the equation

$$z_{\alpha/2} - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}} = -z_\beta.$$

Solving this we get:

$$n = \frac{\sigma^2}{|\mu_0 - \mu_1|^2}\left(|z_{\alpha/2}| + |z_\beta|\right)^2.$$

Notice that the sample size depends on how we choose $\mu_1$. The smaller $|\mu_0 - \mu_1|$ is, the larger $n$ should be to get the same power.

In the case of a one-sided test everything is the same, but use the formula

$$n = \frac{\sigma^2}{|\mu_0 - \mu_1|^2}\left(|z_{\alpha}| + |z_\beta|\right)^2$$

instead.

Therefore: if we choose $\mu_1$ as the threshold, then we need at least $n$ observations to obtain the power $100(1 - \beta)\%$ in the test.
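The recipe above translates into a short function. The sketch below is one possible implementation, assuming known $\sigma$ and a power given as a fraction (0.90) rather than a percentage; the names `sample_size` and `one_sided_power`, and the illustrative inputs ($\mu_0 = 0$, $\mu_1 = 0.5$, $\sigma = 1$), are hypothetical and not taken from the slides.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal


def sample_size(mu0, mu1, sigma, alpha=0.05, power=0.90, two_sided=False):
    """Smallest n given by n = sigma^2 (z_a + z_b)^2 / (mu0 - mu1)^2."""
    beta = 1 - power
    z_a = Z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))  # upper-tail critical value
    z_b = Z.inv_cdf(1 - beta)                                 # P(Z >= z_b) = beta
    n_exact = sigma ** 2 * (z_a + z_b) ** 2 / (mu0 - mu1) ** 2
    return ceil(n_exact)  # round up: n must be a whole number of observations


def one_sided_power(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided z-test, used to check the chosen n."""
    z_a = Z.inv_cdf(1 - alpha)
    return 1 - Z.cdf(z_a - abs(mu0 - mu1) / (sigma / sqrt(n)))


# Hypothetical inputs, for illustration only.
n_one = sample_size(0.0, 0.5, 1.0)                  # one-sided test
n_two = sample_size(0.0, 0.5, 1.0, two_sided=True)  # two-sided test
print(f"one-sided n = {n_one}, two-sided n = {n_two}")
print(f"power with n = {n_one}: {one_sided_power(0.0, 0.5, 1.0, n_one):.4f}")
print(f"power with n = {n_one - 1}: {one_sided_power(0.0, 0.5, 1.0, n_one - 1):.4f}")
```

Plugging the returned $n$ back into the power formula confirms that it is the smallest sample size reaching the target for the one-sided test: one observation fewer falls just short of 90%.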
Example 3

Returning to the training and selling items example. Suppose we want the probability of rejecting the null when the mean is $\mu = 34$ to be at least 90%. How large a sample should I use?

Solution 3

Useful information to obtain the sample size: $H_0: \mu \le 33$ against $H_A: \mu \ge 34$, $\sigma = 8.4$, $\alpha = 0.05$, power = 90%, hence the type II error is $\beta = 1 - 0.9 = 0.1$.

Recall that we use a one-sided test, hence the sample size formula is

$$n = \frac{\sigma^2}{|\mu_0 - \mu_1|^2}\left(|z_{\alpha}| + |z_\beta|\right)^2.$$

We note that $z_\alpha = 1.64$, $z_\beta = 1.28$ (look up in tables) and $\sigma^2 = 8.4^2$. This all leads to

$$n = \frac{8.4^2}{|33 - 34|^2}(1.64 + 1.28)^2 = 70.56 \times 8.5264 \approx 601.6.$$

Hence we need a sample size of at least 602 to be able to reject the null when the mean is 34 with probability 90%.

Example 4

Over the past 20 years there has been plenty of debate on whether school examination standards are falling or not. It is known that the mean mark in an exam 20 years ago was 65% and the standard deviation is 4. This year a sample of marks from 9 high school students was taken. The sample mean was found to be 69%. Suppose $\mu$ is the mean mark of students taking exams this year.

(a) Test the hypothesis that the mean grade has increased. Remember to state the null that you use and any assumptions you make about the distribution of the sample mean. Use $\alpha = 0.1$.

(b) The threshold between honours and distinction is 70%. Compute the power for the alternative $\mu_A \ge 70\%$.

(c) How large should $n$ be such that the power when the alternative is $\mu = 66$ is 95%?

Solution 4

(a) The hypothesis is $H_0: \mu \le 65$ against $H_A: \mu > 65$. We assume that the sample mean has a normal distribution $N(\mu, 4^2/9)$. Draw the density under the null and indicate where the rejection region should lie (you don't have to give the exact values). To obtain the p-value, standardise (under the null),

$$z = \frac{69 - 65}{\sqrt{4^2/9}} = 3,$$

and evaluate $P(\bar{X} \ge 69 \mid N(65, 4^2/9)) = P(Z > 3) = 0.0013$. (Remember we calculate $P(\bar{X} \ge 69 \mid N(65, 4^2/9))$ because we require the region which does not contain the mean under the null.) Since $0.0013 < 0.1$ there is evidence to reject the null.
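The arithmetic in Solution 3 and Solution 4(a) can be checked with a short script. This is a sketch that evaluates the one-sided sample size formula twice, once with the rounded table values 1.64 and 1.28 and once with exact normal quantiles; the helper name `n_one_sided` is illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal


def n_one_sided(mu0, mu1, sigma, z_alpha, z_beta):
    """One-sided sample size formula, before rounding up."""
    return sigma ** 2 * (z_alpha + z_beta) ** 2 / (mu0 - mu1) ** 2


# Example 3: mu0 = 33, mu1 = 34, sigma = 8.4, alpha = 0.05, power = 90%.
n_table = n_one_sided(33, 34, 8.4, z_alpha=1.64, z_beta=1.28)  # table values
n_exact = n_one_sided(33, 34, 8.4, Z.inv_cdf(0.95), Z.inv_cdf(0.90))
print(f"table values:    n = {n_table:.1f}, round up to {ceil(n_table)}")
print(f"exact quantiles: n = {n_exact:.1f}, round up to {ceil(n_exact)}")

# Example 4(a): z statistic and upper-tail p-value under the null mean 65.
z_stat = (69 - 65) / sqrt(4 ** 2 / 9)
p_value = 1 - Z.cdf(z_stat)
print(f"z = {z_stat:.2f}, p-value = {p_value:.4f}")
```

The small gap between the two sample sizes comes only from rounding the quantiles to two decimals; either way, detecting a shift of one unit with $\sigma = 8.4$ takes roughly 600 observations.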
(b) This is a power calculation. Use the one-sided formula. We need to look up 0.1 in the tables. Remember 0.1 is a probability, so we need to look inside the tables; we find that $P(Z \ge 1.28) = 0.1$, so $z_{0.1} = 1.28$. Now plug all these numbers into the formula:

$$\beta = P\left(Z \le z_{\alpha} - \frac{|\mu_0 - \mu_1|}{\sigma/\sqrt{n}}\right) = P\left(Z \le 1.28 - \frac{|65 - 70|}{\sqrt{4^2/9}}\right) = P\left(Z \le 1.28 - \frac{5}{4/3}\right) = P(Z \le -2.47) = 0.0068.$$

The power is $1 - 0.0068 = 0.9932$; this is a very large power!

(c) Use the one-sided equation. Since the power we want is 95%, $\beta = 0.05$ and we look up $z_{0.05}$ in the tables. Looking inside the table and reading out, we see that $z_{0.05} = 1.64$. Now we use the one-sided equation:

$$n = \frac{\sigma^2}{|\mu_0 - \mu_1|^2}\left(|z_{\alpha}| + |z_\beta|\right)^2 = \frac{4^2}{|65 - 66|^2}(1.28 + 1.64)^2 \approx 136.4.$$

Hence we need to sample at least 137 students to detect a change with probability 95% when the true mean is 66%.
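Parts (b) and (c) of Solution 4 can be verified in the same way. The sketch below uses the rounded table values $z_{0.1} = 1.28$ and $z_{0.05} = 1.64$ so that it follows the hand calculation; the variable names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

Z = NormalDist()  # standard normal
mu0, sigma = 65, 4
z_alpha = 1.28  # table value for alpha = 0.10 (one-sided)

# (b) Power at the alternative mu1 = 70 with n = 9.
n, mu1 = 9, 70
beta_b = Z.cdf(z_alpha - abs(mu0 - mu1) / (sigma / sqrt(n)))
power_b = 1 - beta_b
print(f"(b) beta = {beta_b:.4f}, power = {power_b:.4f}")

# (c) Sample size for 95% power at the alternative mu1 = 66.
z_beta = 1.64  # table value for beta = 0.05
n_c = sigma ** 2 * (z_alpha + z_beta) ** 2 / (mu0 - 66) ** 2
print(f"(c) n = {n_c:.1f}, round up to {ceil(n_c)}")
```

The contrast between the two parts mirrors the earlier discussion: nine students are plenty to detect a five-point shift, while a one-point shift needs well over a hundred.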