Notes on Hypothesis Testing, Type I and Type II Errors

Jonathan Hore, PA 818, Fall 2006

Part 1. Hypothesis Testing

Suppose that a medical firm develops a new medicine that it claims will lead to a higher mean cure rate. Suppose the old cure rate was \( \mu_0 \). The firm claims that the new mean rate is \( \mu_1 > \mu_0 \). How can the firm verify its claim? By using hypothesis testing.

Null Hypothesis: The position that must get the benefit of the doubt. This is usually the conventional wisdom. In this case, \( H_0: \mu \le \mu_0 \).

Alternative Hypothesis: The claim we seek to prove, often called the Research Hypothesis. Here, \( H_a: \mu > \mu_0 \).

Aim of the Researcher: Find evidence in favor of \( H_a \). So the researcher will reject \( H_0 \) if the calculated sample mean \( \bar{x} \) is high relative to \( \mu_0 \).

Important Step: Assume \( \mu = \mu_0 \). Of all the values the null allows (\( \mu \le \mu_0 \)), this is the one under which a high sample mean is most likely, so it gives \( H_0 \) the maximum chance of being true when we observe a high sample mean. Remember that the question of rejecting the null only arises if the sample mean is high.

Next: Assuming \( \mu = \mu_0 \), calculate the probability of observing a sample mean of \( \bar{x} \) or higher. If \( \Pr[\bar{X} \ge \bar{x}] \) is low even after giving the null this much benefit of the doubt, chances are that the null hypothesis does not hold. Essentially, we are asking: what is the probability of observing a sample mean at least as high as the one we actually see, conditional on the null being true? If this probability is very low, that is evidence that the null hypothesis does not hold. With \( \sigma \) the population standard deviation and \( n \) the sample size, the probability we want to compute is:

\[ \Pr[\bar{X} \ge \bar{x} \mid H_0 \text{ true}] = \Pr[\bar{X} \ge \bar{x} \mid \mu = \mu_0] = \Pr\!\left[ Z \ge \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \right] = p \]

This is the p-value. If \( p < \alpha \), then reject \( H_0 \).

So what is \( \alpha \)? It is a predetermined probability level that specifies the proportion of times we are willing to reject \( H_0 \) when \( H_0 \) is true. If we reject \( H_0 \) when \( H_0 \) is true, we are making an error, so \( \alpha \) is the proportion of times we are willing to make this error. How high \( \alpha \) should be depends on how costly (in terms of reputation, monetary cost, etc.) this error is likely to be. If the cost is very high, we should set \( \alpha \) very low. Typically \( \alpha \) = .10, .05, .025 or .01.

Suppose \( p > \alpha \). Should we reject \( H_0 \)? NO!! Suppose \( p = .06 \) and \( \alpha = .05 \). Then \( p = .06 \) implies that \( \bar{X} \ge \bar{x} \) occurs 6% of the time when \( H_0 \) is true (with \( \mu = \mu_0 \)). So if we reject \( H_0 \) because \( \bar{X} \ge \bar{x} \), we will be committing an error 6% of the time. But \( \alpha = .05 \) implies that we are only willing to make these mistakes 5% of the time, so we should not reject \( H_0 \).
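The p-value rule described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the population standard deviation is known and the sample mean is approximately normal; the function names and the numbers in the usage lines are hypothetical and do not come from the notes.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_p_value(sample_mean, mu0, sigma, n):
    """Pr[X-bar >= sample_mean | mu = mu0] for a z-test with known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    return 1.0 - NormalDist().cdf(z)


def reject_null(p_value, alpha):
    """Reject H0 only when the p-value is strictly below alpha."""
    return p_value < alpha


# Hypothetical numbers: z = (52 - 50) / (10 / sqrt(100)) = 2.0
p = right_tailed_p_value(sample_mean=52.0, mu0=50.0, sigma=10.0, n=100)
print(f"p-value = {p:.4f}, reject at alpha = 0.05: {reject_null(p, 0.05)}")

# The case from the text: p = 0.06 with alpha = 0.05 is NOT a rejection.
print(reject_null(0.06, 0.05))
```

Keeping the decision rule in its own small function makes the comparison with alpha explicit, which is the step the p = .06 versus alpha = .05 discussion turns on.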

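Alternative Method to Conduct Hypothesis Tests

We can also use critical values to conduct hypothesis tests. Here, we calculate a critical value \( c \) such that \( \Pr[\bar{X} \ge c \mid \mu = \mu_0] = \alpha \):

\[ \Pr\!\left[ Z \ge \frac{c - \mu_0}{\sigma/\sqrt{n}} \right] = \alpha \;\Rightarrow\; \frac{c - \mu_0}{\sigma/\sqrt{n}} = z_\alpha \;\Rightarrow\; c = \mu_0 + z_\alpha \frac{\sigma}{\sqrt{n}} \]

Then, if \( \bar{x} \ge c \), we can reject \( H_0 \). The interpretation of \( c \) is that, even if \( H_0 \) is true, rejecting \( H_0 \) whenever \( \bar{x} \ge c \) is an error no more than a proportion \( \alpha \) of the time. This is within tolerable levels.

EXAMPLE 1. A store wants to install a new billing system that will be cost effective only if the mean monthly account exceeds $170. In a sample of 400 accounts, the sample mean monthly account was $178. Suppose \( \sigma \) = $65 and \( \alpha = .05 \). Should the new system be installed?

\[ H_0: \mu \le 170 \qquad H_a: \mu > 170 \]

First, work out the p-value:

\[ \Pr[\bar{X} > 178 \mid H_0 \text{ true}] = \Pr[\bar{X} > 178 \mid \mu = 170] = \Pr\!\left[ Z > \frac{178 - 170}{65/\sqrt{400}} \right] = \Pr[Z > 2.46] = .0069 < .05 = \alpha \]

Since the p-value is lower than \( \alpha \), we can reject the null hypothesis and install the new system. The interpretation is as follows. Given \( \mu \le 170 \), the probability of observing a sample mean of 178 or higher is at most 0.69%. So it is reasonable to conclude that \( \mu > 170 \). This conclusion may be wrong, but if \( H_0 \) were true a sample mean this high would occur less than 1% of the time. Hence there is little risk in rejecting \( H_0 \).

Alternatively, we could use the critical value approach to compute the rejection region. So we need to find \( c \) such that:

\[ \Pr[\bar{X} \ge c \mid \mu = 170] = \Pr\!\left[ Z \ge \frac{c - 170}{65/\sqrt{400}} \right] = .05 \;\Rightarrow\; \frac{c - 170}{65/\sqrt{400}} = z_{.05} = 1.645 \;\Rightarrow\; c = 170 + 1.645 \cdot \frac{65}{\sqrt{400}} = 175.35 \]

Then, since \( \bar{x} = 178 > 175.35 \), we can reject \( H_0 \).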
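As a check on Example 1, the two approaches can be computed side by side. This is a sketch using only the Python standard library; the variable names are illustrative, and the inputs (hypothesized mean 170, standard deviation 65, 400 accounts, sample mean 178, significance level .05) are the values of the example.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

mu0, sigma, n, alpha = 170.0, 65.0, 400, 0.05
x_bar = 178.0

std_err = sigma / sqrt(n)                  # 65 / 20 = 3.25
z = (x_bar - mu0) / std_err                # about 2.46

# Approach 1: p-value, Pr[Z > z]
p_value = 1.0 - std_normal.cdf(z)

# Approach 2: critical value c = mu0 + z_alpha * sigma / sqrt(n)
z_alpha = std_normal.inv_cdf(1.0 - alpha)  # about 1.645
critical_value = mu0 + z_alpha * std_err

reject_by_p = p_value < alpha
reject_by_c = x_bar >= critical_value

print(f"z = {z:.2f}, p-value = {p_value:.4f}, c = {critical_value:.2f}")
print(f"reject by p-value: {reject_by_p}, reject by critical value: {reject_by_c}")
```

Both rules are equivalent by construction, so they always give the same decision; computing both is only a consistency check.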

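What we just did was a Right-Tailed Test (\( H_0: \mu \le \mu_0 \)). A Left-Tailed Test is very similar. Then,

\[ H_0: \mu \ge \mu_0 \qquad H_a: \mu < \mu_0 \]

And so we reject \( H_0 \) if the sample mean is too LOW. The construction of a p-value is very similar to before:

\[ \Pr[\bar{X} < \bar{x} \mid H_0 \text{ true}] = \Pr[\bar{X} < \bar{x} \mid \mu = \mu_0] = \Pr\!\left[ Z < \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \right] = p \]

And we reject \( H_0 \) if \( p < \alpha \). The only difference in a Left-Tailed Test is that we want \( \Pr[\bar{X} \le \bar{x} \mid H_0 \text{ true}] \) instead of \( \Pr[\bar{X} \ge \bar{x} \mid H_0 \text{ true}] \) as in the Right-Tailed Test case (because low values of the sample mean provide evidence against the null). We can also use the critical value approach, where the critical value is \( c = \mu_0 - z_\alpha \sigma/\sqrt{n} \) (instead of \( c = \mu_0 + z_\alpha \sigma/\sqrt{n} \) in the Right-Tailed Test).

EXAMPLE 2. Say that \( H_0: \mu \ge 2 \) and \( H_a: \mu < 2 \). Suppose that \( \sigma = 6 \), \( \alpha = .1 \), \( n = 220 \) and \( \bar{x} = 1.63 \). Then the p-value is:

\[ \Pr[\bar{X} < 1.63 \mid \mu = 2] = \Pr\!\left[ Z < \frac{1.63 - 2}{6/\sqrt{220}} \right] \approx \Pr[Z < -.92] = .1788 \]

Since \( p > \alpha \), we cannot reject the null hypothesis that \( \mu \ge 2 \). We can also use the critical value approach to construct the rejection region:

\[ c = \mu_0 - z \frac{\sigma}{\sqrt{n}} = 2 - 1.645 \cdot \frac{6}{\sqrt{220}} = 1.3346 \]

Then, since \( \bar{x} = 1.63 > 1.3346 \), we cannot reject the null. Here 1.645 is \( z_{.05} \), a stricter cutoff than \( \alpha = .1 \) requires; with \( z_{.1} = 1.2816 \) the critical value is 1.4816, and since \( 1.63 > 1.4816 \) the conclusion is the same.

These were both One-Tailed Tests. Finally, we consider a Two-Tailed Test.

\[ H_0: \mu = \mu_0 \qquad H_a: \mu \ne \mu_0 \]

In the case of a Two-Tailed Test, we reject the null hypothesis if the sample mean turns out to be too big OR too small. The calculation of the p-value is slightly different from a One-Tailed Test. Now, we want to calculate

\[ \Pr\!\left[ \bar{X} < \mu_0 - |\bar{x} - \mu_0| \right] + \Pr\!\left[ \bar{X} > \mu_0 + |\bar{x} - \mu_0| \right] = 2 \Pr\!\left[ Z > \frac{|\bar{x} - \mu_0|}{\sigma/\sqrt{n}} \right] = p \]

where both probabilities are computed under \( \mu = \mu_0 \). Then, reject \( H_0 \) if \( p < \alpha \). We can also use critical values to derive the rejection region, as before. Now however, there must be two critical values:

\[ c_1 = \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \qquad c_2 = \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \]

So we reject the null if: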

\[ \bar{x} \notin \left[ \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\; \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right] \]

EXAMPLE 3. \( H_0: \mu = 17.09 \), \( H_a: \mu \ne 17.09 \). Suppose that \( \sigma = 3.87 \), \( \alpha = .05 \), \( n = 100 \) and \( \bar{x} = 17.55 \). So let's first compute the p-value:

\[ p = 2 \Pr\!\left[ Z > \frac{|\bar{x} - \mu_0|}{\sigma/\sqrt{n}} \right] = 2 \Pr\!\left[ Z > \frac{|17.55 - 17.09|}{3.87/\sqrt{100}} \right] = 2 \Pr[Z > 1.19] = 2(.1170) = .234 \]

Since \( p = .234 > .05 = \alpha \), we cannot reject the null. Alternatively, we can find the rejection region using critical values:

\[ c_1 = \mu_0 + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 17.09 + 1.96 \cdot \frac{3.87}{\sqrt{100}} = 17.09 + .75852 = 17.84852 \]

\[ c_2 = \mu_0 - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 17.09 - 1.96 \cdot \frac{3.87}{\sqrt{100}} = 17.09 - .75852 = 16.33148 \]

Then, since \( \bar{x} = 17.55 \in [16.33148, 17.84852] \), we cannot reject the null.

Part 2. Type I and Type II Errors

            | H0 True      | H0 False
Accept H0   | OK           | Type II Error
Reject H0   | Type I Error | OK

Type I errors are very serious: you do not want to make wild claims and be proven wrong later. \( \alpha \) in our previous discussion is the probability of making a Type I error. Hypothesis tests are designed to make \( \alpha \) low, since we choose a low value for \( \alpha \)!

Type II errors occur when the null hypothesis is false but we fail to reject it. These errors are not as serious, but we would like to avoid them.

There is a trade-off between Type I and Type II errors. If we do not change the sample size, the only way to reduce Type II errors is by increasing the probability of making Type I errors. However, both can be reduced by increasing \( n \).

Type II errors are always computed for a given \( \alpha \). Think of the probability of NOT making a Type II error. To compute this, we need to specify a value for \( \mu \), \( \mu_a \), which we (the researcher) believe is true instead of \( \mu_0 \). Then the probability of not committing a Type II error is:

\[ p(\mu_a) = \Pr[\text{Reject } H_0 \text{ at significance } \alpha \mid H_0 \text{ false}] = \Pr[\bar{X} \text{ falls in the rejection region} \mid \mu = \mu_a] \]

So we have that:

\[ \Pr[\text{Type II Error}] = 1 - p(\mu_a) \]

\( p(\mu_a) \) is the Power of the test given \( \mu = \mu_a \).
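The definition of power and the trade-off with the Type I error rate can be illustrated numerically. The sketch below assumes a right-tailed z-test with known standard deviation; the function name `power_right_tailed` and all the numbers in the usage loop are hypothetical choices for illustration, not values from the notes.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_right_tailed(mu0, mu_a, sigma, n, alpha):
    """Power p(mu_a) = Pr[X-bar >= c | mu = mu_a] of a right-tailed z-test.

    c is the critical value computed under the null, mu = mu0.
    """
    std_err = sigma / sqrt(n)
    c = mu0 + STD_NORMAL.inv_cdf(1.0 - alpha) * std_err
    return 1.0 - STD_NORMAL.cdf((c - mu_a) / std_err)


# Hypothetical setting: mu0 = 100, believed alternative mu_a = 103, sigma = 15.
# Holding n fixed, a larger alpha lowers the Type II error probability.
for alpha in (0.01, 0.05, 0.10):
    beta = 1.0 - power_right_tailed(100.0, 103.0, 15.0, 36, alpha)
    print(f"n = 36, alpha = {alpha:.2f}: Pr[Type II error] = {beta:.3f}")

# Holding alpha fixed, a larger n lowers it as well.
for n in (36, 144):
    beta = 1.0 - power_right_tailed(100.0, 103.0, 15.0, n, 0.05)
    print(f"n = {n}, alpha = 0.05: Pr[Type II error] = {beta:.3f}")
```

When the alternative equals the null value, the function returns alpha itself, which is a quick sanity check that the critical value was computed under the null.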

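EXAMPLE 4. Suppose the old manufacturing process produces 80 units per hour. We want to evaluate the claim that a new manufacturing process produces 85 units per hour. Let \( \alpha = .05 \) and \( n = 50 \), with known population standard deviation \( \sigma \) (the critical value of 84 obtained next corresponds to \( \sigma/\sqrt{50} \approx 2.43 \)). Hence,

\[ H_0: \mu \le 80 \qquad H_a: \mu > 80 \]

So first, we compute the rejection region.

\[ \Pr[\bar{X}_{50} \ge c \mid \mu = 80] = \Pr\!\left[ Z \ge \frac{c - 80}{\sigma/\sqrt{50}} \right] = .05 \;\Rightarrow\; \frac{c - 80}{\sigma/\sqrt{50}} = z_{.05} = 1.645 \;\Rightarrow\; c = 80 + 1.645 \cdot \frac{\sigma}{\sqrt{50}} \approx 84 \]

So we reject the null if \( \bar{X}_{50} \ge 84 \).

Next, find the probability of a Type II error given \( \mu_a = 85 \). The probability of not committing a Type II error is:

\[ p(85) = \Pr[\bar{X}_{50} \ge 84 \mid \mu = 85] = \Pr\!\left[ Z \ge \frac{84 - 85}{\sigma/\sqrt{50}} \right] = .6595 \]

So the power of this test is .6595, and the probability of a Type II error is 1 − .6595 = .3405. Hence, even if the claim is correct, the statistical test will fail to show it 34% of the time.

We can reduce the probability of a Type II error by increasing \( n \). Now, suppose \( n = 100 \). The rejection region is:

\[ \Pr[\bar{X}_{100} \ge c \mid \mu = 80] = \Pr\!\left[ Z \ge \frac{c - 80}{\sigma/\sqrt{100}} \right] = .05 \;\Rightarrow\; \frac{c - 80}{\sigma/\sqrt{100}} = z_{.05} = 1.645 \;\Rightarrow\; c = 80 + 1.645 \cdot \frac{\sigma}{\sqrt{100}} \approx 82.83 \]

So the power is:

\[ p(85) = \Pr[\bar{X}_{100} \ge 82.83 \mid \mu = 85] = \Pr\!\left[ Z \ge \frac{82.83 - 85}{\sigma/\sqrt{100}} \right] = .8962 \]

So the probability of a Type II error is 1 − p(85) = 1 − .8962 = .1038. Hence, doubling the sample size reduced the probability of a Type II error from 34% to 10%.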
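The effect of sample size in Example 4 can be reproduced approximately in Python. One caveat is stated loudly here and in the code: the value of the standard deviation is an assumption, backed out from the example's critical value of about 84 at a sample size of 50 (which gives roughly 17.19). The name `type_ii_error` is illustrative, and the sample size of 200 is an extra hypothetical case.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

mu0, mu_a, alpha = 80.0, 85.0, 0.05

# ASSUMPTION: sigma is inferred from the example's critical value c = 84
# at n = 50, i.e. (84 - 80) = 1.645 * sigma / sqrt(50), so sigma is about 17.19.
sigma = (84.0 - 80.0) / 1.645 * sqrt(50)


def type_ii_error(n):
    """Pr[fail to reject H0 | mu = mu_a] for the right-tailed z-test."""
    std_err = sigma / sqrt(n)
    c = mu0 + STD_NORMAL.inv_cdf(1.0 - alpha) * std_err  # critical value under H0
    return STD_NORMAL.cdf((c - mu_a) / std_err)          # Pr[X-bar < c | mu = mu_a]


results = {n: type_ii_error(n) for n in (50, 100, 200)}
for n, beta in results.items():
    print(f"n = {n:3d}: Pr[Type II error] = {beta:.4f}, power = {1.0 - beta:.4f}")
```

Small differences from the figures in the example (for instance in the third decimal place) come from rounding the critical values to 84 and 82.83 before computing the power.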