Hypothesis tests and confidence intervals

The 95% confidence interval for $\mu$ is the set of values, $\mu_0$, such that the null hypothesis $H_0: \mu = \mu_0$ would not be rejected by a two-sided test with $\alpha = 5\%$.

The 95% CI for $\mu$ is the set of plausible values of $\mu$. If a value of $\mu$ is plausible, then as a null hypothesis, it would not be rejected.

For example: 9.98 9.87 10.05 10.08 9.99 9.90, assumed iid normal($\mu$, $\sigma$).

$\bar{X} = 9.98$; $s = 0.082$; $n = 6$; qt(0.975, 5) = 2.57

95% CI for $\mu$: $9.98 \pm 2.57 \times 0.082 / \sqrt{6} = 9.98 \pm 0.086 = (9.89, 10.06)$

Sample size calculations

n = ($ available) / ($ per sample)

Too little data: a total waste
Too much data: a partial waste
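The confidence interval example above can be checked numerically. The following is a minimal sketch in R using the six data values from the example; the two trial null values (10.00 and 10.10) are chosen here purely for illustration and are not from the slides.

```r
# Data from the worked example
x <- c(9.98, 9.87, 10.05, 10.08, 9.99, 9.90)
n <- length(x)

xbar  <- mean(x)                 # sample mean
s     <- sd(x)                   # sample SD
tcrit <- qt(0.975, df = n - 1)   # t quantile with n - 1 = 5 d.f.

# 95% CI by hand: xbar +/- tcrit * s / sqrt(n)
half_width <- tcrit * s / sqrt(n)
ci_manual  <- c(xbar - half_width, xbar + half_width)

# The same interval from the built-in one-sample t-test
ci_builtin <- t.test(x, conf.level = 0.95)$conf.int

# Duality with the two-sided test (illustrative null values):
p_inside  <- t.test(x, mu = 10.00)$p.value  # 10.00 is inside the CI
p_outside <- t.test(x, mu = 10.10)$p.value  # 10.10 is outside the CI

round(ci_manual, 2)
c(p_inside = p_inside, p_outside = p_outside)
```

A null value inside the interval gives a P-value above 0.05, and one outside the interval gives a P-value below 0.05, which is the duality described above.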

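Power

$X_1, \dots, X_n$ iid normal($\mu_A$, $\sigma_A$)
$Y_1, \dots, Y_m$ iid normal($\mu_B$, $\sigma_B$)

Test $H_0: \mu_A = \mu_B$ vs $H_a: \mu_A \neq \mu_B$ at $\alpha = 0.05$.

Test statistic: $T = \dfrac{\bar{X} - \bar{Y}}{\widehat{SD}(\bar{X} - \bar{Y})}$.

Critical value: $C$ such that $\Pr(|T| > C \mid \mu_A = \mu_B) = \alpha$.

Power: $\Pr(|T| > C \mid \mu_A \neq \mu_B)$

[Figure: power, with $-C$, 0 and $C$ marked on the horizontal axis.]

Power depends on...

- The design of your experiment
- What test you're doing
- Chosen significance level, $\alpha$
- Sample size
- True difference, $\mu_A - \mu_B$
- Population SDs, $\sigma_A$ and $\sigma_B$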

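The case of known population SDs

Suppose $\sigma_A$ and $\sigma_B$ are known.

Then $\bar{X} - \bar{Y} \sim$ normal$\left(\mu_A - \mu_B,\ \sqrt{\sigma_A^2/n + \sigma_B^2/m}\right)$

Test statistic: $Z = \dfrac{\bar{X} - \bar{Y}}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}$

If $H_0$ is true ($\mu_A = \mu_B$), $Z \sim$ normal(0, 1), so $C = z_{\alpha/2}$, which gives $\Pr(|Z| > C \mid \mu_A = \mu_B) = \alpha$.

For example, for $\alpha = 0.05$, C = qnorm(0.975) = 1.96.

Power when the population SDs are known

If $\mu_A - \mu_B = \Delta$, then $Z = \dfrac{\bar{X} - \bar{Y}}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}} \sim$ normal$\left(\dfrac{\Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}},\ 1\right)$

$\Pr\left(\dfrac{|\bar{X} - \bar{Y}|}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}} > 1.96\right)$

$= \Pr\left(\dfrac{\bar{X} - \bar{Y}}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}} > 1.96\right) + \Pr\left(\dfrac{\bar{X} - \bar{Y}}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}} < -1.96\right)$

$= \Pr\left(\dfrac{\bar{X} - \bar{Y} - \Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}} > 1.96 - \dfrac{\Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}\right) + \Pr\left(\dfrac{\bar{X} - \bar{Y} - \Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}} < -1.96 - \dfrac{\Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}\right)$

$= \Pr\left(Z > 1.96 - \dfrac{\Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}\right) + \Pr\left(Z < -1.96 - \dfrac{\Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}\right)$

In the last line, $Z$ denotes a normal(0, 1) random variable.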

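Calculations in R

Power $= \Pr\left(Z > 1.96 - \dfrac{\Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}\right) + \Pr\left(Z < -1.96 - \dfrac{\Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}\right)$

    C <- qnorm(0.975)
    se <- sqrt( sigmaA^2/n + sigmaB^2/m )
    power <- 1 - pnorm(C - delta/se) + pnorm(-C - delta/se)

Power curves

[Figure: power (0 to 100%) versus $\Delta$ (from $-2$ to 2), with curves for n = 20, n = 10 and n = 5.]

Power depends on...

Power $= \Pr\left(Z > C - \dfrac{\Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}\right) + \Pr\left(Z < -C - \dfrac{\Delta}{\sqrt{\sigma_A^2/n + \sigma_B^2/m}}\right)$

- Choice of $\alpha$ (which affects $C$): larger $\alpha$ means a less stringent test and greater power.
- $\Delta = \mu_A - \mu_B$ = the true "effect": larger $|\Delta|$ means greater power.
- Population SDs, $\sigma_A$ and $\sigma_B$: smaller $\sigma$'s mean greater power.
- Sample sizes, $n$ and $m$: larger $n$, $m$ mean greater power.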
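The R calculation above can be wrapped in a function and evaluated over a grid of effects to trace out power curves like those in the figure. This is a minimal sketch: the function name `power_known_sd`, the choice $\sigma_A = \sigma_B = 1$ and the equal group sizes are assumptions made for illustration, since the slides do not state the settings used for the plot.

```r
# Power of the two-sided z-test when both population SDs are known.
# (Function name and default alpha are illustrative choices.)
power_known_sd <- function(delta, n, m, sigmaA, sigmaB, alpha = 0.05) {
  C  <- qnorm(1 - alpha / 2)                  # critical value, 1.96 for alpha = 0.05
  se <- sqrt(sigmaA^2 / n + sigmaB^2 / m)     # SD of Xbar - Ybar
  1 - pnorm(C - delta / se) + pnorm(-C - delta / se)
}

# Assumed settings: sigmaA = sigmaB = 1 and n = m in each group
delta  <- seq(-2, 2, by = 0.5)
curves <- sapply(c(5, 10, 20), function(n) {
  power_known_sd(delta, n = n, m = n, sigmaA = 1, sigmaB = 1)
})
colnames(curves) <- c("n=5", "n=10", "n=20")
rownames(curves) <- delta

round(100 * curves, 1)   # power in percent, one column per sample size
```

At $\Delta = 0$ the power equals $\alpha$, and for any fixed nonzero $\Delta$ it increases with the sample size, matching the qualitative statements in the list above.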

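Choice of sample size

We mostly influence power via $n$ and $m$.

Power is greatest when $\sqrt{\sigma_A^2/n + \sigma_B^2/m}$ is as small as possible.

Suppose the total sample size $N = n + m$ is fixed.

$\sqrt{\sigma_A^2/n + \sigma_B^2/m}$ is minimized when $n = \dfrac{\sigma_A}{\sigma_A + \sigma_B} \times N$ and $m = \dfrac{\sigma_B}{\sigma_A + \sigma_B} \times N$

For example:
- If $\sigma_A = \sigma_B$, we should choose $n = m$.
- If $\sigma_A = 2\sigma_B$, we should choose $n = 2m$.
  (e.g., if $\sigma_A = 4$ and $\sigma_B = 2$, we might use n = 20 and m = 10)

Calculating the sample size

Suppose we seek 80% power to detect a particular value of $\mu_A - \mu_B = \Delta$, in the case that $\sigma_A$ and $\sigma_B$ are known.

For convenience here, let's pretend that $\sigma_A = \sigma_B = \sigma$ and that we plan to have equal sample sizes for the two groups.

Power $\approx \Pr\left(Z > C - \dfrac{\Delta}{\sigma\sqrt{2/n}}\right) = \Pr\left(Z > 1.96 - \dfrac{\Delta\sqrt{n}}{\sigma\sqrt{2}}\right)$

Find $n$ such that $\Pr\left(Z > 1.96 - \dfrac{\Delta\sqrt{n}}{\sigma\sqrt{2}}\right) = 80\%$.

Thus $1.96 - \dfrac{\Delta\sqrt{n}}{\sigma\sqrt{2}} = $ qnorm(0.2) $= -0.842$.

$\Longrightarrow \sqrt{n} = \dfrac{\sigma\sqrt{2}}{\Delta}\,[1.96 - (-0.842)] \Longrightarrow n = 15.7 \times \left(\dfrac{\sigma}{\Delta}\right)^2$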
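Both results above, the allocation of a fixed total $N$ and the sample size for 80% power, can be coded in a few lines. This is a sketch under the same assumptions as the text (known SDs; for the sample size formula, equal SDs, equal group sizes, and the lower rejection tail ignored). The function names `allocate` and `sample_size_known_sd` are made up for this example.

```r
# Split a fixed total N between groups in proportion to the population SDs.
allocate <- function(N, sigmaA, sigmaB) {
  c(n = N * sigmaA / (sigmaA + sigmaB),
    m = N * sigmaB / (sigmaA + sigmaB))
}

# Per-group sample size for a two-sided z-test with equal, known SDs.
# Uses the one-tail approximation: power ~ Pr(Z > C - delta * sqrt(n) / (sigma * sqrt(2))).
sample_size_known_sd <- function(delta, sigma, power = 0.8, alpha = 0.05) {
  z_alpha <- qnorm(1 - alpha / 2)   # 1.96 for alpha = 0.05
  z_power <- qnorm(1 - power)       # -0.842 for 80% power
  2 * (sigma / delta)^2 * (z_alpha - z_power)^2
}

allocate(N = 30, sigmaA = 4, sigmaB = 2)          # the sigmaA = 4, sigmaB = 2 example
sample_size_known_sd(delta = 1, sigma = 1)        # the constant multiplying (sigma/delta)^2
ceiling(sample_size_known_sd(delta = 5, sigma = 10))  # illustrative values, rounded up
```

Rounding up with `ceiling` is the usual choice, since rounding down would leave the power slightly below the target.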

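Equal but unknown population SDs

$X_1, \dots, X_n$ iid normal($\mu_A$, $\sigma$)
$Y_1, \dots, Y_m$ iid normal($\mu_B$, $\sigma$)

Test $H_0: \mu_A = \mu_B$ vs $H_a: \mu_A \neq \mu_B$ at $\alpha = 0.05$.

$\hat\sigma_p = \sqrt{\dfrac{s_A^2 (n-1) + s_B^2 (m-1)}{n + m - 2}}$

$\widehat{SD}(\bar{X} - \bar{Y}) = \hat\sigma_p \sqrt{\dfrac{1}{n} + \dfrac{1}{m}}$

Test statistic: $T = \dfrac{\bar{X} - \bar{Y}}{\widehat{SD}(\bar{X} - \bar{Y})}$.

In the case $\mu_A = \mu_B$, $T$ follows a t distribution with $n + m - 2$ d.f.

Critical value: C = qt(0.975, n+m-2)

Power: equal but unknown population SDs

Power $= \Pr\left(\dfrac{|\bar{X} - \bar{Y}|}{\hat\sigma_p \sqrt{1/n + 1/m}} > C\right)$

In the case $\mu_A - \mu_B = \Delta$, the statistic $\dfrac{\bar{X} - \bar{Y}}{\hat\sigma_p \sqrt{1/n + 1/m}}$ follows a non-central t distribution.

This distribution has two parameters:
- degrees of freedom (as before)
- the non-centrality parameter, $\dfrac{\Delta}{\sigma \sqrt{1/n + 1/m}}$

    C <- qt(0.975, n + m - 2)
    se <- sigma * sqrt( 1/n + 1/m )
    power <- 1 - pt(C, n+m-2, ncp=delta/se) + pt(-C, n+m-2, ncp=delta/se)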
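To see how much power is lost when the common SD has to be estimated, the non-central t calculation can be set beside its known-SD counterpart. This is a sketch only: the function names, and the settings $\Delta = 1$, $\sigma = 1$ with equal group sizes of 5, 10 and 20, are assumptions chosen for illustration.

```r
# Power of the two-sided pooled t-test (equal but unknown SDs), via the non-central t.
power_equal_sd <- function(delta, n, m, sigma, alpha = 0.05) {
  df <- n + m - 2
  C  <- qt(1 - alpha / 2, df)
  se <- sigma * sqrt(1 / n + 1 / m)
  1 - pt(C, df, ncp = delta / se) + pt(-C, df, ncp = delta / se)
}

# Known-SD counterpart (two-sided z-test with a common SD), for comparison.
power_known_sd <- function(delta, n, m, sigma, alpha = 0.05) {
  C  <- qnorm(1 - alpha / 2)
  se <- sigma * sqrt(1 / n + 1 / m)
  1 - pnorm(C - delta / se) + pnorm(-C - delta / se)
}

# Assumed illustrative settings: delta = 1, sigma = 1, n = m
comparison <- t(sapply(c(5, 10, 20), function(n) {
  c(n       = n,
    known   = power_known_sd(1, n, n, sigma = 1),
    unknown = power_equal_sd(1, n, n, sigma = 1))
}))

round(comparison, 3)
```

The gap between the two columns shrinks as the sample size grows, because the t distribution approaches the normal as the degrees of freedom increase.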

Power curves

[Figure: power (0 to 100%) versus $\Delta$ (from $-2$ to 2) for n = 20, n = 10 and n = 5, comparing known SDs with unknown SDs.]

A built-in function: power.t.test

Calculate power (or determine the sample size) for the t-test when:
- Sample sizes equal
- Population SDs equal

Arguments:
- n = sample size
- delta = $\Delta = \mu_2 - \mu_1$
- sd = $\sigma$ = population SD
- sig.level = $\alpha$ = significance level
- power = the power
- type = type of data (two-sample, one-sample, paired)
- alternative = two-sided or one-sided test

Examples

A. n = 10 for each group; effect = $\Delta$ = 5; population SD = $\sigma$ = 10

    power.t.test(n=10, delta=5, sd=10)

   Result: power = 18%

B. power = 80%; effect = $\Delta$ = 5; population SD = $\sigma$ = 10

    power.t.test(delta=5, sd=10, power=0.8)

   Result: n = 63.8, so use 64 for each group

C. power = 80%; effect = $\Delta$ = 5; population SD = $\sigma$ = 10; one-sided

    power.t.test(delta=5, sd=10, power=0.8, alternative="one.sided")

   Result: n = 50.2, so use 51 for each group

Unknown and different population SDs

$X_1, \dots, X_n$ iid normal($\mu_A$, $\sigma_A$)
$Y_1, \dots, Y_m$ iid normal($\mu_B$, $\sigma_B$)

Test $H_0: \mu_A = \mu_B$ vs $H_a: \mu_A \neq \mu_B$ at $\alpha = 0.05$.

Test statistic: $T = \dfrac{\bar{X} - \bar{Y}}{\sqrt{s_A^2/n + s_B^2/m}}$

To calculate the critical value for the test, we need the null distribution of $T$ (that is, the distribution of $T$ if $\mu_A = \mu_B$).

To calculate the power, we need the distribution of $T$ given the value of $\Delta = \mu_A - \mu_B$.

We don't really know either of these.
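Examples A to C can be run as written and the relevant component pulled out of each result. This sketch relies only on `power.t.test` from the base `stats` package; the object names are arbitrary.

```r
# A. Power for n = 10 per group
ex_a <- power.t.test(n = 10, delta = 5, sd = 10)

# B. Sample size per group for 80% power, two-sided
ex_b <- power.t.test(delta = 5, sd = 10, power = 0.8)

# C. Sample size per group for 80% power, one-sided
ex_c <- power.t.test(delta = 5, sd = 10, power = 0.8, alternative = "one.sided")

round(ex_a$power, 2)   # power for example A
ceiling(ex_b$n)        # n per group for example B, rounded up
ceiling(ex_c$n)        # n per group for example C, rounded up
```

Exactly one of `n`, `delta`, `sd`, `power` and `sig.level` must be left unspecified (or passed as `NULL`); `power.t.test` solves for that one.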

Power by computer simulation

- Specify $n$, $m$, $\sigma_A$, $\sigma_B$, and $\Delta = \mu_A - \mu_B$, and the significance level, $\alpha$.
- Simulate data under the model.
- Perform the proposed test and calculate the P-value.
- Repeat many times.

Example: $n = 5$, $m = 10$, $\sigma_A = 1$, $\sigma_B = 2$, $\Delta$ = 0.0, 0.5, 1.0, 1.5, 2.0 or 2.5.

[Figure: histograms of simulated P-values (horizontal axis from 0 to 1), one panel for each of $\Delta$ = 0, 0.5, 1.0, 1.5, 2.0 and 2.5.]
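The simulation recipe above might look as follows in R for the example settings. This is a minimal sketch: the slides do not say which test was "the proposed test", so Welch's t-test (the default of `t.test`) is assumed here, and the number of replicates and the seed are arbitrary.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# Estimate power by simulation: the fraction of simulated data sets with P < alpha.
# Assumption: the proposed test is Welch's t-test (t.test with var.equal = FALSE).
sim_power <- function(n, m, sigmaA, sigmaB, delta, alpha = 0.05, n_sim = 2000) {
  pvals <- replicate(n_sim, {
    x <- rnorm(n, mean = delta, sd = sigmaA)  # group A, shifted by delta
    y <- rnorm(m, mean = 0,     sd = sigmaB)  # group B
    t.test(x, y)$p.value
  })
  mean(pvals < alpha)
}

# Settings from the example: n = 5, m = 10, sigmaA = 1, sigmaB = 2
deltas    <- c(0, 0.5, 1.0, 1.5, 2.0, 2.5)
power_est <- sapply(deltas, function(d) sim_power(5, 10, 1, 2, d))
names(power_est) <- deltas

round(power_est, 2)
```

With 2000 replicates the Monte Carlo standard error of each estimate is at most about 0.011, so a larger `n_sim` is needed if finer resolution matters. At $\Delta = 0$ the estimate is the simulated type I error rate rather than a power.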

[Figure: estimated power (0 to 100%) versus $\Delta$ (from 0.0 to 2.5).]

Determining sample size

The things you need to know:
- Structure of the experiment
- Method for analysis
- Chosen significance level, $\alpha$ (usually 5%)
- Desired power (usually 80%)
- Variability in the measurements
  (if necessary, perform a pilot study, or use data from prior experiments or publications)
- The smallest meaningful effect

Reducing sample size

- Reduce the number of treatment groups being compared.
- Find a more precise measurement (e.g., average survival time rather than proportion dead).
- Decrease the variability in the measurements.
  - Make subjects more homogeneous.
  - Use stratification.
  - Control for other variables (e.g., weight).
  - Average multiple measurements on each subject.

Tests to compare two means

1. Assume $\sigma_1 = \sigma_2$
   (a) Calculate pooled estimate of population SD
   (b) $\widehat{SE} = \hat\sigma_{\text{pooled}} \sqrt{\dfrac{1}{n} + \dfrac{1}{m}}$
   (c) Compare to t(df = n + m - 2)
   In R: t.test with var.equal=TRUE

2. Allow $\sigma_1 \neq \sigma_2$
   (a) $\widehat{SE} = \sqrt{\dfrac{s_1^2}{n} + \dfrac{s_2^2}{m}}$
   (b) Compare to t with df from nasty formula.
   In R: t.test with var.equal=FALSE (the default)
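The two tests can be run side by side, and their standard errors and degrees of freedom reproduced by hand. This sketch uses simulated data with made-up settings (the group sizes, means, SDs and seed are all assumptions). The "nasty formula" is taken to be the Welch-Satterthwaite approximation, which is what `t.test` uses when `var.equal=FALSE`.

```r
set.seed(2)  # arbitrary seed
x <- rnorm(8,  mean = 10, sd = 1)   # hypothetical group 1
y <- rnorm(12, mean = 11, sd = 3)   # hypothetical group 2
n <- length(x); m <- length(y)

# 1. Assume equal SDs: pooled estimate, df = n + m - 2
pooled    <- t.test(x, y, var.equal = TRUE)
sd_pooled <- sqrt(((n - 1) * var(x) + (m - 1) * var(y)) / (n + m - 2))
se_pooled <- sd_pooled * sqrt(1 / n + 1 / m)

# 2. Allow unequal SDs: separate variances, approximate df
welch    <- t.test(x, y)            # var.equal = FALSE is the default
se_welch <- sqrt(var(x) / n + var(y) / m)
df_welch <- se_welch^4 /
  ((var(x) / n)^2 / (n - 1) + (var(y) / m)^2 / (m - 1))

c(pooled_df = unname(pooled$parameter), welch_df = unname(welch$parameter))
c(pooled_p  = pooled$p.value,           welch_p  = welch$p.value)
```

The Welch degrees of freedom always fall between `min(n, m) - 1` and `n + m - 2`, so its critical value is never smaller than that of the pooled test.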

Estimated type I error rates

$X_1, \dots, X_4$ iid normal($\mu$, 1)
$Y_1, \dots, Y_4$ iid normal($\mu$, $\tau$)
10,000 simulations

In each table, rows give the result of the test that assumes $\sigma_1 = \sigma_2$ and columns give the result of the test that allows $\sigma_1 \neq \sigma_2$. FTR $H_0$ = fail to reject $H_0$.

τ = 1
                          Allow σ1 ≠ σ2
    Assume σ1 = σ2     FTR H0    Reject H0    Total
    FTR H0             0.948     0.000        0.948
    Reject H0          0.009     0.043        0.052
    Total              0.957     0.043

τ = 1.5
                          Allow σ1 ≠ σ2
    Assume σ1 = σ2     FTR H0    Reject H0    Total
    FTR H0             0.944     0.000        0.944
    Reject H0          0.009     0.047        0.056
    Total              0.953     0.047

τ = 2
                          Allow σ1 ≠ σ2
    Assume σ1 = σ2     FTR H0    Reject H0    Total
    FTR H0             0.940     0.000        0.940
    Reject H0          0.012     0.048        0.060
    Total              0.952     0.048

τ = 4
                          Allow σ1 ≠ σ2
    Assume σ1 = σ2     FTR H0    Reject H0    Total
    FTR H0             0.924     0.000        0.924
    Reject H0          0.023     0.054        0.076
    Total              0.946     0.054

Estimated power

$X_1, \dots, X_4$ iid normal($\mu$, 1)
$Y_1, \dots, Y_4$ iid normal($\mu + 2$, $\tau$)
10,000 simulations

τ = 1
                          Allow σ1 ≠ σ2
    Assume σ1 = σ2     FTR H0    Reject H0    Total
    FTR H0             0.344     0.000        0.344
    Reject H0          0.046     0.611        0.656
    Total              0.389     0.611

τ = 1.5
                          Allow σ1 ≠ σ2
    Assume σ1 = σ2     FTR H0    Reject H0    Total
    FTR H0             0.532     0.000        0.532
    Reject H0          0.057     0.411        0.468
    Total              0.589     0.411

τ = 2
                          Allow σ1 ≠ σ2
    Assume σ1 = σ2     FTR H0    Reject H0    Total
    FTR H0             0.658     0.000        0.658
    Reject H0          0.060     0.282        0.342
    Total              0.718     0.282

τ = 4
                          Allow σ1 ≠ σ2
    Assume σ1 = σ2     FTR H0    Reject H0    Total
    FTR H0             0.836     0.000        0.836
    Reject H0          0.047     0.117        0.164
    Total              0.883     0.117
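A simulation along the lines of the tables above can be sketched as follows. Several details are assumptions rather than facts from the slides: the X group is given SD 1 and the Y group SD $\tau$, fewer replicates are used than the 10,000 reported, and the seed and function name are arbitrary. The estimates will therefore be close to, but not identical to, the tabulated values.

```r
set.seed(3)  # arbitrary seed

# Rejection rates of the two t-tests for groups of size n.
# shift = 0 gives type I error rates; shift = 2 gives power.
# Assumption: X has SD 1 and Y has SD tau.
compare_tests <- function(tau, shift = 0, n = 4, n_sim = 2000, alpha = 0.05) {
  rejections <- replicate(n_sim, {
    x <- rnorm(n, mean = 0,     sd = 1)
    y <- rnorm(n, mean = shift, sd = tau)
    c(assume = t.test(x, y, var.equal = TRUE)$p.value < alpha,  # pooled test
      allow  = t.test(x, y)$p.value < alpha)                    # Welch test
  })
  rowMeans(rejections)  # proportion of simulations in which each test rejects
}

taus <- c(1, 1.5, 2, 4)
type1 <- sapply(taus, compare_tests, shift = 0)
power <- sapply(taus, compare_tests, shift = 2)
colnames(type1) <- colnames(power) <- paste0("tau=", taus)

round(type1, 3)
round(power, 3)
```

With equal group sizes the two tests use the same standard error, and the Welch degrees of freedom never exceed $n + m - 2$, so the Welch test can reject only when the pooled test does. That is why the "FTR under Assume, Reject under Allow" cell is 0.000 in every table.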