Section 5.4: Hypothesis testing for μ

Possible claims or hypotheses:
o Ball bearings have $\mu = 1$ cm.
o Medicine decreases blood pressure.

For testing hypotheses, we set up a null hypothesis ($H_0$) and an alternative hypothesis ($H_a$). Some possibilities:
o $H_0: \mu = 1$ cm, $H_a: \mu \ne 1$ cm (ball bearing diameters)
o $H_0: \mu \le 120$ mm, $H_a: \mu > 120$ mm (blood pressures when on drug)
o $H_0: \mu \ge 120$ mm, $H_a: \mu < 120$ mm (blood pressures when on drug)

Null hypothesis: the hypothesis accepted unless there is sufficient evidence to the contrary.

Law case: $H_0$: Innocent, $H_a$: Guilty
o In a court of law, the defendant is innocent until "proven" guilty.
o The burden of proof is to disprove innocence ($H_0$).

For statistical hypothesis tests:
o The null hypothesis, $H_0$, is innocent until proven guilty.

For example, suppose regulations limit factory effluent to 1 ppm copper.
o $H_0: \mu \le 1$ ppm, $H_a: \mu > 1$ ppm. Burden of proof: to show noncompliance.
o $H_0: \mu > 1$ ppm, $H_a: \mu \le 1$ ppm. Burden of proof: to show compliance.

Often, in research, the burden of proof is on you to show that your research hypothesis has merit. The research hypothesis is most often $H_a$, and $H_0$ states that the research hypothesis doesn't hold.

Example: New blood pressure medicine, with $\mu$ = change in BP.
Research hypothesis: the drug reduces BP.
o $H_0: \mu \ge 0$ (drug doesn't help)
o $H_a: \mu < 0$ (drug helps)

Example: Suppose we are checking a 3-minute sand timer.

Suppose we will accept the manufacturer's claim unless we have sufficient evidence to the contrary.
o $H_0: \mu = 180$ seconds
o $H_a: \mu \ne 180$ seconds

To test the claim, we measure the time elapsed $n = 50$ times.
o $\bar{y}$ = sample average of the $n = 50$ trials.
o See if $\bar{y}$ is close enough to 180.
o How far from 180 should $\bar{y}$ be before we reject $\mu = 180$?
o Possible rule: Reject $H_0: \mu = 180$ if $\bar{y} > 180 + 1.96\,\sigma_{\bar{y}}$ or $\bar{y} < 180 - 1.96\,\sigma_{\bar{y}}$.

With this rule:
o $P(\text{reject } H_0: \mu = 180 \mid \mu = 180) = 0.05$
o $P(\text{reject } H_0 \mid H_0 \text{ true}) = 0.05$
o The notation $\alpha$ is used for $P(\text{type I error})$.
o In this case, $\alpha = 0.05$.

For $\alpha = 0.01$, using Z table probabilities, we reject $H_0$ if $|\bar{y} - 180| > 2.58\,\sigma_{\bar{y}}$.
o Say $\sigma = 5$ sec, $n = 50$, $\alpha = 0.05$.
o $\sigma_{\bar{y}} = \sigma/\sqrt{n} = 5/\sqrt{50} = \sqrt{25/50} = \sqrt{1/2} = 0.707$
o $1.96\,\sigma_{\bar{y}} = 1.96(0.707) \approx 1.4$
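The rejection rule for the timer can be sketched in a few lines of Python. This is only an illustration under the stated values ($\mu_0 = 180$, $\sigma = 5$, $n = 50$); the function name `rejection_cutoffs` is made up for this sketch, and the critical value comes from the standard library's normal distribution rather than from a Z table.

```python
from math import sqrt
from statistics import NormalDist

def rejection_cutoffs(mu0, sigma, n, alpha=0.05):
    """Return (lower, upper) cutoffs for ybar in a two-sided z test of H0: mu = mu0."""
    se = sigma / sqrt(n)                           # standard error of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 when alpha = 0.05
    return mu0 - z_crit * se, mu0 + z_crit * se

# Sand timer example: reject H0 when ybar falls outside (lower, upper).
lower, upper = rejection_cutoffs(mu0=180, sigma=5, n=50, alpha=0.05)
print(f"Reject H0 if ybar < {lower:.2f} or ybar > {upper:.2f}")
```

The cutoffs land about 1.4 seconds on either side of 180, matching the hand calculation.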

In general, suppose we are testing a two-sided alternative, where the $H_a$ values are on two sides of the $H_0$ value.
o $H_0: \mu = \mu_0$ (e.g. $\mu_0 = 180$, $H_0: \mu = 180$)
o $H_a: \mu \ne \mu_0$

For an $\alpha$-level test, reject $H_0$ if
$|\bar{y} - \mu_0| > z_{\alpha/2}\,\sigma_{\bar{y}}$, or equivalently $\frac{|\bar{y} - \mu_0|}{\sigma_{\bar{y}}} > z_{\alpha/2}$, that is, $|z| > z_{\alpha/2}$.

In this chapter, $z_{\alpha/2}$ means the $100(1 - \alpha/2)$th percentile of the standard normal distribution. The value $z_{0.025}$ can be found at the bottom of a t-table (Section 5.7): $z_{0.025} = 1.96$.

o Suppose $\bar{y} = 181.8$, we know that $\sigma = 5$, and we test at the $\alpha = 0.05$ level.
o $\sigma_{\bar{y}} = \sigma/\sqrt{n} = 5/\sqrt{50} = 0.707$
o $z = \frac{181.8 - 180}{0.707} = 2.55 > 1.96$, so reject $H_0$.
o If $\alpha = 0.01$, reject $H_0$ if $|z| > 2.58$.
o Since $2.55 < 2.58$, we do not reject $H_0$ at the $\alpha = 0.01$ level.
o For $\alpha = 0.01$ compared to $\alpha = 0.05$, we are less willing to tolerate rejecting $H_0$ when $H_0$ is true (claiming the timer doesn't have the advertised mean), so rejecting $H_0$ is harder.
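As a rough check of the two-sided test at both levels, the sketch below computes $z$ and compares it with $z_{\alpha/2}$. It assumes $\bar{y} = 181.8$ (the sample mean used in the confidence interval for this example), $\sigma = 5$, and $n = 50$; the function name `two_sided_z_test` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_z_test(ybar, mu0, sigma, n, alpha):
    """Return (z, z_crit, reject) for H0: mu = mu0 vs Ha: mu != mu0 with known sigma."""
    se = sigma / sqrt(n)                           # sigma_ybar
    z = (ybar - mu0) / se                          # standardized distance from mu0
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    return z, z_crit, abs(z) > z_crit

# Assumed inputs: ybar = 181.8, sigma = 5, n = 50.
for alpha in (0.05, 0.01):
    z, z_crit, reject = two_sided_z_test(181.8, 180, 5, 50, alpha)
    decision = "reject H0" if reject else "do not reject H0"
    print(f"alpha = {alpha}: z = {z:.2f}, critical value = {z_crit:.2f}, {decision}")
```

The same data lead to rejection at $\alpha = 0.05$ but not at $\alpha = 0.01$, because the critical value grows as $\alpha$ shrinks.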

o An equivalent approach would be to see if $\mu = 180$ is in the (two-sided) 95% confidence interval.
o Here the 95% confidence interval is $181.8 \pm 1.96(0.707)$, i.e. $181.8 \pm 1.4$, or 180.4 to 183.2.
o Reject $H_0$ since $\mu = 180$ is not included.
o The confidence interval gives more information than just "Reject $H_0$".
o We know how different from 180 the true mean, $\mu$, might be.

Opinion: You are better off using confidence intervals rather than hypothesis tests.

3 cases:
o $H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$ (2-sided)
o $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$ (1-sided)
o $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$ (1-sided)

$H_a: \mu \ne \mu_0$: a 2-sided alternative
o Reject $H_0$ if $\bar{y}$ is far from $\mu_0$ in either direction.
o Reject $H_0$ if $|z| > z_{\alpha/2}$, i.e. $\frac{|\bar{y} - \mu_0|}{SE_{\bar{y}}} > z_{\alpha/2}$.

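$H_a: \mu < \mu_0$: a 1-sided alternative
o Reject $H_0$ if $\bar{y}$ is far below $\mu_0$.
o Reject $H_0$ if $z < -z_{\alpha}$, i.e. $\frac{\bar{y} - \mu_0}{SE_{\bar{y}}} < -z_{\alpha}$.

$H_a: \mu > \mu_0$: a 1-sided alternative
o Reject $H_0$ if $\bar{y}$ is far above $\mu_0$.
o Reject $H_0$ if $z > z_{\alpha}$, i.e. $\frac{\bar{y} - \mu_0}{SE_{\bar{y}}} > z_{\alpha}$.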
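The three rejection rules can be collected into one small helper. This is a minimal sketch: the function name `reject_h0` and the labels `"two-sided"`, `"less"`, and `"greater"` are naming choices made here for illustration, not notation from the notes.

```python
from statistics import NormalDist

def reject_h0(z, alpha, alternative):
    """Apply the z-test rejection rule for the given form of Ha.

    alternative: "two-sided" (Ha: mu != mu0), "less" (Ha: mu < mu0),
                 or "greater" (Ha: mu > mu0).
    """
    std_normal = NormalDist()
    if alternative == "two-sided":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)   # |z| > z_{alpha/2}
    if alternative == "less":
        return z < -std_normal.inv_cdf(1 - alpha)           # z < -z_alpha
    if alternative == "greater":
        return z > std_normal.inv_cdf(1 - alpha)            # z > z_alpha
    raise ValueError("alternative must be 'two-sided', 'less', or 'greater'")

# The same z can lead to different decisions depending on Ha.
for alt in ("two-sided", "less", "greater"):
    print(alt, reject_h0(-1.9, 0.05, alt))
```

With $z = -1.9$ and $\alpha = 0.05$, only the "less" alternative rejects: $-1.9$ is below $-1.645$, but $|{-1.9}|$ is short of 1.96.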

o $\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$ (actually, the worst case)
o Consider $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$.
o $\alpha = \max_{\mu \ge \mu_0} P(\text{reject } H_0)$
o The $H_0$ case most likely to be mistaken for $H_a$ is $\mu = \mu_0$.
o $\alpha = P(\text{reject } H_0 \mid \mu = \mu_0)$
o In many books, $H_0$ is written with an = sign.
o $H_0: \mu = \mu_0$, $H_a: \mu < \mu_0$

Think again of a law case, with $H_0$: Innocent.

                   Decision: Acquit          Decision: Convict
Truth = Innocent   Correct                   Type I error ($\alpha$)
Truth = Guilty     Type II error ($\beta$)   Correct

o $\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$. In this example, $\alpha = P(\text{convict} \mid \text{innocent})$.
o $\beta = P(\text{type II error}) = P(\text{accept } H_0 \mid H_0 \text{ is false})$. In this example, $\beta = P(\text{acquit} \mid \text{guilty})$.
o $1 - \beta = \text{Power} = P(\text{reject } H_0 \mid H_0 \text{ is false})$
o $\beta$ depends on how false $H_0$ is.

Example: Light bulbs with a manufacturer's claim of a mean lifetime of 1000 hours.
o Suppose we accept the manufacturer's claim unless we have sufficient evidence to the contrary.
o $H_0: \mu = 1000$ hours, $H_a: \mu \ne 1000$ hours
o $\alpha = P(\text{reject } H_0: \mu = 1000 \mid \mu = 1000)$
o $\beta = P(\text{don't reject } H_0: \mu = 1000 \mid \mu \ne 1000)$
o $\beta$ is different for different values of $\mu \ne 1000$.
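The worst-case reading of $\alpha$ for a one-sided null can be checked numerically. The sketch below computes $P(\text{reject } H_0)$ for the test of $H_0: \mu \ge \mu_0$ against $H_a: \mu < \mu_0$ at several true means inside $H_0$. The values $\mu_0 = 0$ and $SE = 1$ are arbitrary choices for illustration, and the function name is hypothetical.

```python
from statistics import NormalDist

def prob_reject_lower_tail(mu, mu0, se, alpha):
    """P(reject H0) for H0: mu >= mu0 vs Ha: mu < mu0 when the true mean is mu.

    The test rejects when (ybar - mu0)/se < -z_alpha, and ybar ~ N(mu, se^2).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    # P(ybar < mu0 - z_alpha*se) = Phi(-z_alpha - (mu - mu0)/se)
    return std_normal.cdf(-z_alpha - (mu - mu0) / se)

# Illustrative values only: mu0 = 0, se = 1, alpha = 0.05.
for mu in (0.0, 0.5, 1.0, 2.0):
    p = prob_reject_lower_tail(mu, mu0=0.0, se=1.0, alpha=0.05)
    print(f"true mu = {mu}: P(reject H0) = {p:.4f}")
```

The rejection probability equals $\alpha$ at $\mu = \mu_0$ and shrinks as $\mu$ moves further into $H_0$, which is why $\alpha$ is evaluated at the boundary.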

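Calculating Power

Back to the 3-minute timer: $H_0: \mu = 180$, $H_a: \mu \ne 180$, $\alpha = 0.05$.
o Reject $H_0$ if $\bar{y} > 180 + 1.96\,SE_{\bar{y}}$ or $\bar{y} < 180 - 1.96\,SE_{\bar{y}}$.

If $\mu = 181.5$, then
Power $= P(\bar{y} > 180 + 1.96\,SE_{\bar{y}} \mid \mu = 181.5) + P(\bar{y} < 180 - 1.96\,SE_{\bar{y}} \mid \mu = 181.5)$
$= P\left(z > \frac{180 + 1.96\,SE_{\bar{y}} - 181.5}{SE_{\bar{y}}}\right) + P\left(z < \frac{180 - 1.96\,SE_{\bar{y}} - 181.5}{SE_{\bar{y}}}\right)$
$= P\left(z > -\frac{1.5}{SE_{\bar{y}}} + 1.96\right) + P\left(z < -\frac{1.5}{SE_{\bar{y}}} - 1.96\right)$

Suppose $n = 100$ and $\sigma = 5$, so $SE_{\bar{y}} = \sigma/\sqrt{n} = 5/\sqrt{100} = 0.5$. Then
Power $= P(z > 1.96 - 3) + P(z < -1.96 - 3) = P(z > -1.04) + P(z < -4.96) \approx P(z > -1.04) = 0.8508$

For a 2-sided test, following the steps above:
Power $\approx P\left(z > z_{\alpha/2} - \frac{|\mu_0 - \mu_a|}{SE_{\bar{y}}}\right)$

For $\mu = 182.5$ with a 2-sided alternative and $\alpha = 0.05$:
Power $\approx P\left(z > 1.96 - \frac{2.5}{0.5}\right) = P(z > 1.96 - 5) = P(z > -3.04) = 0.9988$

The farther the truth is from $H_0: \mu = \mu_0$, the greater the power.
o For testing $\mu = 180$, the power of rejecting $H_0$ is greater when $\mu = 182.5$ than when $\mu = 181.5$.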
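The power calculation above can be reproduced with a short function. This sketch keeps both tails rather than dropping the small one, so it should agree with the approximation to about four decimal places; the name `power_two_sided` is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(mu_true, mu0, sigma, n, alpha=0.05):
    """Power of the two-sided z test of H0: mu = mu0 when the true mean is mu_true."""
    std_normal = NormalDist()
    se = sigma / sqrt(n)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = (mu_true - mu0) / se                      # distance from mu0 in SE units
    upper_tail = 1 - std_normal.cdf(z_crit - shift)   # P(ybar > mu0 + z_crit*se)
    lower_tail = std_normal.cdf(-z_crit - shift)      # P(ybar < mu0 - z_crit*se)
    return upper_tail + lower_tail

# Timer example with n = 100 and sigma = 5.
for mu in (181.5, 182.5):
    print(f"mu = {mu}: power = {power_two_sided(mu, 180, 5, 100):.4f}")
```

Setting `mu_true` equal to `mu0` returns $\alpha$, which is a handy sanity check on the function.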

Section 5.5: Choosing n Based on Power

Deciding $n$ based on power considerations requires:
o $\alpha$ to be specified: $\alpha = P(\text{type I error}) = P(\text{reject } H_0 \mid H_0 \text{ is true})$
o $\beta$ to be specified for some particular alternative value of $\mu$: $\beta = P(\text{type II error}) = P(\text{do not reject } H_0 \mid \mu = \mu_a)$, where $\mu_a$ is some specified $H_a$ value

Example: Sand timer, $H_0: \mu = 180$, $H_a: \mu \ne 180$. Suppose we decide to
o Set $\alpha = 0.05$
o Require power of 0.80 to reject $H_0$ if $\mu = 181$

Solve for $n$:
$(1.96 + 0.84)\,\frac{\sigma}{\sqrt{n}} = 181 - 180$
$\sqrt{n} = \frac{(1.96 + 0.84)\,\sigma}{181 - 180}$
$n = \frac{(1.96 + 0.84)^2\, 5^2}{1^2} = 196$

In general, going through these steps for a 2-sided alternative:
$n \approx \frac{(z_{\alpha/2} + z_{\beta})^2 \sigma^2}{\Delta^2}$, where $\Delta = \mu_a - \mu_0$ is the difference to detect.

The solution is only approximate because we ignored a small probability on one side of the normal curve. This is close enough as long as $\Delta$ is big enough.

In the same way, for a one-sided test:
$n = \frac{(z_{\alpha} + z_{\beta})^2 \sigma^2}{\Delta^2}$
Here we are not ignoring one side of the normal curve, so this is an equality.

The required sample size increases as
o $\sigma^2$ increases
o $\Delta$ decreases
o the $\alpha$, $\beta$ error rates decrease
o $z_{\alpha}$, $z_{\beta}$ increase
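The sample-size formulas can be sketched as a single function. The name `required_n` and the `two_sided` flag are choices made for this illustration. One caveat: the function uses unrounded normal quantiles, so for the sand timer it returns slightly more than the 196 obtained from the table values 1.96 and 0.84, and rounding up gives 197.

```python
from math import ceil
from statistics import NormalDist

def required_n(delta, sigma, alpha, power, two_sided=True):
    """Sample size (before rounding up) to detect a difference delta with the given power."""
    std_normal = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = std_normal.inv_cdf(1 - tail)   # z_{alpha/2} or z_alpha
    z_beta = std_normal.inv_cdf(power)       # z_beta, where beta = 1 - power
    return (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2

# Sand timer: detect mu = 181 vs mu0 = 180 with sigma = 5, alpha = 0.05, power = 0.80.
n_raw = required_n(delta=1, sigma=5, alpha=0.05, power=0.80)
print(f"n from formula: {n_raw:.2f}, rounded up: {ceil(n_raw)}")
```

Calling it with `two_sided=False` gives the one-sided requirement, which is smaller because $z_{\alpha} < z_{\alpha/2}$.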

Section 5.6: P-values

Rather than just saying "Reject $H_0$", we want to say how convincingly $H_0$ is rejected.

Suppose, for example, $H_0: \mu = 180$, $H_a: \mu \ne 180$, $\bar{y} = 181.2$, $n = 100$, $\sigma = 5$.
$SE_{\bar{y}} = \sigma/\sqrt{n} = 5/\sqrt{100} = 0.5$
$z = \frac{\bar{y} - 180}{SE_{\bar{y}}} = \frac{181.2 - 180}{0.5} = \frac{1.2}{0.5} = 2.4$

The observed mean is 2.4 SEs away from $H_0: \mu = 180$. If $H_0: \mu = 180$ is true, we expect $z$ to be around 0. Evidence against $H_0$ is values of $z$ far from 0 in either direction.

The p-value answers:
o What is the probability of a $z$ this far or farther from expected if $H_0$ is true?
o What is the probability of this much or more evidence against $H_0$ if $H_0$ is true?
o What is the probability of this much evidence against the defendant if the defendant is innocent?

p-value $= P(z > 2.4) + P(z < -2.4) = 2(0.0082) = 0.0164$

Small p-values are bad for $H_0$.
o If the data result is far from what's expected under $H_0$, the p-value is small, which counts against $H_0$.

Reject $H_0$ if p-value $< \alpha$. The level $\alpha$ tells us how unusual the data have to be in order to give up on $H_0$.

For a p-value of 0.0164:
o Since $p \le 0.05$, we reject $H_0$ if $\alpha = 0.05$.
o Since $p > 0.01$, we do not reject $H_0$ if $\alpha = 0.01$.

Suppose, for blood pressure changes, $H_0: \mu \ge 0$, $H_a: \mu < 0$, $\bar{y} = -1.8$, $n = 10$, $\sigma = 3$.
$SE_{\bar{y}} = 3/\sqrt{10} = 0.95$
$z = \frac{-1.8 - 0}{0.95} = -1.90$
p-value $= P(z < -1.9) = 0.0287$
o Since $p \le 0.05$, reject $H_0$ if $\alpha = 0.05$.
o Since $p > 0.01$, do not reject $H_0$ if $\alpha = 0.01$.

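In general:
o $H_0: \mu = \mu_0$, $H_a: \mu \ne \mu_0$: p-value $= P(|z| > |z_{\text{data}}|)$
o $H_0: \mu \le \mu_0$, $H_a: \mu > \mu_0$: p-value $= P(z > z_{\text{data}})$
o $H_0: \mu \ge \mu_0$, $H_a: \mu < \mu_0$: p-value $= P(z < z_{\text{data}})$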
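The three p-value rules can be sketched as one function and applied to the timer and blood pressure examples of this section. The function name and the `alternative` labels are illustrative choices. Because the code does not round $z$ to two decimals, the blood pressure p-value comes out marginally larger than the table-based 0.0287.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(ybar, mu0, sigma, n, alternative):
    """Return (z, p_value) for a z test with known sigma.

    alternative: "two-sided" (Ha: mu != mu0), "greater" (Ha: mu > mu0),
                 or "less" (Ha: mu < mu0).
    """
    std_normal = NormalDist()
    z = (ybar - mu0) / (sigma / sqrt(n))
    if alternative == "two-sided":
        p = 2 * (1 - std_normal.cdf(abs(z)))   # P(|Z| > |z_data|)
    elif alternative == "greater":
        p = 1 - std_normal.cdf(z)              # P(Z > z_data)
    elif alternative == "less":
        p = std_normal.cdf(z)                  # P(Z < z_data)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z, p

# Timer example (two-sided) and blood pressure example (one-sided, lower tail).
print(z_test_p_value(181.2, 180, 5, 100, "two-sided"))
print(z_test_p_value(-1.8, 0, 3, 10, "less"))
```

Comparing the returned p-value with $\alpha$ reproduces the decisions above: both examples reject at $\alpha = 0.05$ and neither rejects at $\alpha = 0.01$.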