Power and Sample Size: Issues in Study Design
John McGready
Department of Biostatistics, Bloomberg School

Lecture Topics
- Re-visit concept of statistical power
- Factors influencing power
- Sample size determination when comparing two means
- Sample size determination when comparing two proportions

Section A
Power and Its Influences
Example
- Consider the following results from a study done on 29 women, all 35-39 years old

  Sample Data      n     Mean SBP    SD of SBP
  OC users         8     132.8       15.3
  Non-OC users     21    127.4       18.2

Example
- Of particular interest is whether OC use is associated with higher blood pressure
- Statistically speaking, we are interested in testing:
  $H_0: \mu_{OC} = \mu_{NO\,OC}$, i.e. $H_0: \mu_{OC} - \mu_{NO\,OC} = 0$
  $H_A: \mu_{OC} \neq \mu_{NO\,OC}$, i.e. $H_A: \mu_{OC} - \mu_{NO\,OC} \neq 0$
- Here $\mu_{OC}$ represents the population mean SBP for OC users, and $\mu_{NO\,OC}$ the population mean BP for women not using OC

Example
- Let's unveil the results of this test from our data (the sample data shown in the table above)
Example
- The sample mean difference in blood pressures is 132.8 - 127.4 = 5.4
- This could be considered scientifically significant; however, the result is not statistically significant (or even close to it!) at the α = .05 level
- Suppose, as a researcher, you were concerned about detecting a population difference of this magnitude if it truly existed
- This particular study of 29 women has low power to detect a difference of such magnitude

Power
- Recall, from lecture four of SR1:

                      DECISION
  TRUTH     Reject H0                   Not Reject H0
  H0        Type I Error (alpha-level)
  HA        Power (1 - beta)            Type II Error (beta)

Power
- Power is a measure of "doing the right thing" when $H_A$ is true!
- Higher power is better (the closer the power is to 1.0 or 100%)
- We can calculate power for a given study if we specify a specific $H_A$
- This OC/blood pressure study has power of .13 to detect a difference in blood pressure of 5.4 or more, if this difference truly exists in the population of women 35-39 years old!
- When power is this low, it is difficult to determine whether there is no statistical difference in population means or we just could not detect it
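The power of .13 quoted for the OC/blood pressure study can be roughly reproduced with a normal approximation. The sketch below is illustrative only: the function name `power_two_means` is hypothetical, it assumes a two-sided z-test with the sample SDs treated as the population SDs, and it is not claimed to be the exact calculation Stata performs.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(diff, sd1, sd2, n1, n2, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two means.

    Hypothetical helper: assumes large-sample normality and known SDs.
    """
    z = NormalDist()
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)   # standard error of the mean difference
    shift = abs(diff) / se                 # true difference measured in standard errors
    z_crit = z.inv_cdf(1 - alpha / 2)      # about 1.96 for alpha = .05
    # Chance that the sample result lands in either rejection region
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Study values: difference 5.4, SDs 15.3 and 18.2, group sizes 8 and 21
power = power_two_means(5.4, 15.3, 18.2, 8, 21)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the result is close to the .13 stated on the slide.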
Power
- Recall, the sampling behavior of $\bar{x}_1 - \bar{x}_2$ is normally distributed (large samples), with this sampling distribution centered at the true mean difference. If $H_0$ is the truth, the curve is centered at 0
  [Figure: sampling distribution under $H_0: \mu_1 - \mu_2 = 0$]

Power
- Recall, the sampling behavior of $\bar{x}_1 - \bar{x}_2$ is normally distributed (large samples), with this sampling distribution centered at the true mean difference. If $H_A$ is the truth, the curve is centered at some value $d$, $d \neq 0$
  [Figure: sampling distribution under $H_A: \mu_1 - \mu_2 = d$]

Power
- $H_0$ will be rejected (for α = .05) if the sample result, $\bar{x}_1 - \bar{x}_2$, is more than 2 standard errors away from 0, either above or below
  [Figure: rejection regions under $H_0: \mu_1 - \mu_2 = 0$]
Power
- $H_0$ will be rejected (for α = .05) if the sample result, $\bar{x}_1 - \bar{x}_2$, is more than 2 standard errors away from 0, either above or below
  [Figure: sampling distributions under $H_0: \mu_1 - \mu_2 = 0$ and $H_A: \mu_1 - \mu_2 = d$]

What Influences Power?
- In order to INCREASE power for a study comparing two group means, we need to do the following:
  1. Change the estimates of $\mu_1$ and $\mu_2$ so that the difference between the two ($\mu_1 - \mu_2$) is bigger
  [Figures: sampling distributions under $H_0: \mu_1 - \mu_2 = 0$ and $H_A: \mu_1 - \mu_2 = d$, before and after increasing $d$]
What Influences Power?
- In order to INCREASE power for a study comparing two group means, we need to do the following:
  2. Increase the sample size in each group
  [Figures: sampling distributions under $H_0: \mu_1 - \mu_2 = 0$ and $H_A: \mu_1 - \mu_2 = d$, before and after increasing the sample size]

What Influences Power?
- In order to INCREASE power for a study comparing two group means, we need to do the following:
  3. Increase the α-level of the hypothesis test (functionally speaking, make it easier to reject); here, with α = .05:
  [Figure: sampling distributions under $H_0: \mu_1 - \mu_2 = 0$ and $H_A: \mu_1 - \mu_2 = d$]
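The three influences on power (size of the difference, sample size, α-level) can be seen numerically. This is a minimal sketch under a normal approximation; the helper name `power_two_means`, the baseline of 50 per group, and the alternative values (difference of 10, 150 per group, α = .10) are illustrative assumptions, with only the SDs (15.3 and 18.2) and the difference of 5.4 taken from the blood pressure example.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(diff, sd1, sd2, n_per_group, alpha=0.05):
    """Approximate two-sided power with equal group sizes (hypothetical helper)."""
    z = NormalDist()
    se = sqrt((sd1**2 + sd2**2) / n_per_group)  # standard error of the difference
    shift = abs(diff) / se                      # difference in standard-error units
    z_crit = z.inv_cdf(1 - alpha / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


# Baseline: illustrative design with 50 per group
baseline = power_two_means(5.4, 15.3, 18.2, n_per_group=50)
# 1. A bigger difference between the two means
bigger_diff = power_two_means(10.0, 15.3, 18.2, n_per_group=50)
# 2. A larger sample size in each group
bigger_n = power_two_means(5.4, 15.3, 18.2, n_per_group=150)
# 3. A larger alpha-level (easier to reject)
bigger_alpha = power_two_means(5.4, 15.3, 18.2, n_per_group=50, alpha=0.10)

for label, value in [("baseline", baseline), ("bigger difference", bigger_diff),
                     ("bigger n", bigger_n), ("alpha = .10", bigger_alpha)]:
    print(f"{label:>18}: power = {value:.2f}")
```

Each of the three changes raises the power relative to the baseline, which is the pattern the figures illustrate.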
What Influences Power?
- In order to INCREASE power for a study comparing two group means, we need to do the following:
  3. Increase the α-level of the hypothesis test (functionally speaking, make it easier to reject); here, with α = .10:
  [Figure: sampling distributions under $H_0: \mu_1 - \mu_2 = 0$ and $H_A: \mu_1 - \mu_2 = d$]

Power and Studies
- Power can be computed after a study is completed
  - Can only be computed for specific $H_A$'s: i.e. "this study had XX% power to detect a difference in population means of YY or greater"
  - Sometimes presented as an "excuse" for a non-statistically significant finding: "the lack of a statistically significant association between A and B could be because of low power (< 15%) to detect a mean difference of YY or greater between.."
  - Can also be presented to corroborate a non-statistically significant result
- Many times, in study design, a required sample size is computed to actually achieve a certain preset power level to find a clinically/scientifically minimal important difference in means
  - In the next section we will show how to do this using Stata
  - "Industry standard" for power: 80% (or greater)

Section B
Sample Size Calculations when Comparing Group Means
Example
- Blood pressure and oral contraceptives
  - Suppose we used data from the example in Section A to motivate the following question:
  - Is oral contraceptive use associated with higher blood pressure among individuals between the ages of 35-39?

Pilot Study
- Recall, the data:

  Sample Data      n     Mean SBP    SD of SBP
  OC users         8     132.8       15.3
  Non-OC users     21    127.4       18.2

Pilot Study
- We think this research has a potentially interesting association
- We want to do a bigger study
  - We want this larger study to have ample power to detect this association, should it really exist in the population
- What we want to do is determine sample sizes needed to detect about a 5 mmHg increase in blood pressure in O.C. users with 80% power at significance level α = .05
  - Using pilot data, we estimate that the standard deviations are 15.3 and 18.2 in O.C. and non-O.C. users respectively
Pilot Study
- Here we have a desired power in mind and want to find the sample sizes necessary to achieve a power of 80% to detect a population difference in blood pressure of five or more mmHg between the two groups

Pilot Study
- We can find the necessary sample size(s) of this study if we specify:
  - α-level of test (.05)
  - Specific values for $\mu_1$ and $\mu_2$ (specific $H_A$) and hence $d = \mu_1 - \mu_2$: usually represents the minimum scientific difference of interest
  - Estimates of $\sigma_1$ and $\sigma_2$
  - The desired power (.80)

Pilot Study
- How can we specify $d = \mu_1 - \mu_2$ and estimate population SDs?
  - Researcher knowledge: experience makes for good educated guesses
  - Make use of pilot study data!
Pilot Study
- Fill in blanks from pilot study
  - α-level of test (.05)
  - Specific $H_A$ ($\mu_{OC} = 132.8$, $\mu_{NO\,OC} = 127.4$), and hence $d = \mu_1 - \mu_2 = 5.4$ mmHg
  - Estimates of $\sigma_{OC}$ (= 15.3) and $\sigma_{NO\,OC}$ (= 18.2)
  - The power we desire (.80)

Pilot Study
- Given this information, we can use Stata to do the sample size calculation
- sampsi command
  - Command syntax (items in italics are numbers to be supplied by researcher)
    sampsi μ1 μ2, alpha(α) power(power) sd1(σ1) sd2(σ2)

Stata Results
- Blood Pressure/OC example
  [Stata output not reproduced]
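As a cross-check on the sample size calculation for comparing two means, the usual normal-approximation formula, $n = (z_{1-\alpha/2} + z_{power})^2(\sigma_1^2 + \sigma_2^2)/d^2$ per group, can be evaluated directly. This is a sketch and not Stata's own code; the function name `n_per_group_two_means` is hypothetical, and it assumes equal group sizes and a two-sided test.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(diff, sd1, sd2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two means (hypothetical helper).

    Assumes equal group sizes, a two-sided test, and a normal approximation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # about 1.96 for alpha = .05
    z_power = z.inv_cdf(power)           # about 0.84 for 80% power
    n = (z_alpha + z_power) ** 2 * (sd1**2 + sd2**2) / diff**2
    return ceil(n)                       # round up to whole participants


# Pilot values: d = 5.4 mmHg, SDs 15.3 and 18.2, alpha = .05, power = .80
n = n_per_group_two_means(5.4, 15.3, 18.2)
print(n)  # 153
```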
Pilot Study/Stata Results
- Our results from Stata suggest that in order to detect a difference in B.P. of 5.4 units (if it really exists in the population) with high (80%) certainty, we would need to enroll 153 O.C. users and 153 non-users
- This assumed that we wanted equal numbers of women in each group!

Stata Results
- Blood Pressure/OC example
  [Stata output not reproduced]

Pilot Study/Stata Results
- Suppose we estimate that the prevalence of O.C. use amongst women 35-39 years of age is 20%
  - We wanted this reflected in our group sizes
- If O.C. users are 20% of the population, non-users are 80%
  - There are four times as many non-users as there are users (4:1 ratio)
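The effect of a 4:1 ratio of non-users to users on the required group sizes can be sketched with the unequal-allocation version of the normal-approximation formula. The function name `group_sizes_two_means` is hypothetical, and the rounding may not match Stata's output exactly; treat the numbers as approximate.

```python
from math import ceil
from statistics import NormalDist


def group_sizes_two_means(diff, sd1, sd2, ratio=1.0, alpha=0.05, power=0.80):
    """Group sizes (n1, n2) with n2 = ratio * n1 (hypothetical helper).

    Normal approximation for a two-sided test comparing two means.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # The variance of the difference is sd1^2/n1 + sd2^2/(ratio * n1)
    n1 = ceil(z_sum**2 * (sd1**2 + sd2**2 / ratio) / diff**2)
    n2 = ceil(ratio * n1)
    return n1, n2


equal = group_sizes_two_means(5.4, 15.3, 18.2)                 # equal groups
unequal = group_sizes_two_means(5.4, 15.3, 18.2, ratio=4.0)    # 4 non-users per user
print("equal groups:", equal, "total", sum(equal))
print("4:1 ratio:   ", unequal, "total", sum(unequal))
```

Under these assumptions the unequal design needs fewer O.C. users but a larger total enrollment than the equal-groups design.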
Pilot Study/Stata Results
- We can specify a ratio of group sizes in Stata
  - Again, using sampsi command with ratio option
    sampsi μ1 μ2, alpha(α) power(power) sd1(σ1) sd2(σ2) ratio(n2/n1)

Stata Results
- Blood Pressure/OC example
  [Stata output not reproduced]

Section C
Sample Size Determination for Comparing Two Proportions
Power for Comparing Two Proportions
- Same ideas as with comparing means

Pilot Study
- We can find the necessary sample size(s) of this study if we specify:
  - α-level of test
  - Specific values for $p_1$ and $p_2$ (specific $H_A$) and hence $d = p_1 - p_2$: usually represents the minimum scientific difference of interest
  - The desired power

Pilot Study
- Given this information, we can use Stata to do the sample size calculation
- sampsi command
  - Command syntax (items in italics are numbers to be supplied by researcher)
    sampsi p1 p2, alpha(α) power(power)
Another Example
- Two drugs for treatment of peptic ulcer compared (Familiari, et al., 1981)
  - The percentage of ulcers healed by pirenzepine (drug A) and trithiozine (drug B) was 77% and 58% based on 30 and 31 patients respectively (p-value = .17); 95% CI for difference in proportions healed ($p_{DRUG\,A} - p_{DRUG\,B}$) was (-.04, .42)
  - The power to detect a difference as large as the sample results with samples of size 30 and 31 respectively is only 25%

            Healed    Not Healed    Total
  Drug A    23        7             30
  Drug B    18        13            31

Example
- As a clinician, you find the sample results intriguing: want to do a larger study to better quantify the difference in proportions healed
- Redesign a new trial, using aforementioned study results to "guestimate" population characteristics
  - Use $p_{DRUG\,A} = .77$ and $p_{DRUG\,B} = .58$
  - 80% power
  - α = .05
- Command in Stata
    sampsi .77 .58, alpha(.05) power(.8)

Example
- Command in Stata
    sampsi .77 .58, alpha(.05) power(.8)
  [Stata output not reproduced]
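For readers without Stata, the two-proportion sample size can be approximated with the standard normal-approximation formula plus a continuity correction. This sketch assumes the correction resembles the default behavior of sampsi, which is an assumption rather than something stated in the lecture; the function name `n_per_group_two_proportions` is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80, continuity=True):
    """Per-group sample size for comparing two proportions (hypothetical helper).

    Equal group sizes, two-sided test, normal approximation.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2                     # pooled proportion under H0
    diff = abs(p1 - p2)
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff**2
    if continuity:
        # Continuity correction for equal group sizes (assumed, see lead-in)
        n = n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2
    return ceil(n)


n_80 = n_per_group_two_proportions(0.77, 0.58, power=0.80)
n_90 = n_per_group_two_proportions(0.77, 0.58, power=0.90)
print("80% power:", n_80, "per group")
print("90% power:", n_90, "per group")
```

Asking for 90% power instead of 80% increases the required group size, all else being equal.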
Example
- What would happen if you change power to 90%?
    sampsi .77 .58, alpha(.05) power(.9)

Example
- Suppose you wanted two times as many people on trithiozine ("Drug B") as compared to pirenzepine ("Drug A")
  - Here, the ratio of sample size for Group 2 to Group 1 is 2.0
- Can use ratio option in sampsi command

Example
- Changing the ratio
    sampsi .77 .58, alpha(.05) power(.9) ratio(2)
Sample Size for Comparing Two Proportions
- A randomized trial is being designed to determine if vitamin A supplementation can reduce the risk of breast cancer
  - The study will follow women between the ages of 45-65 for one year
  - Women were randomized between vitamin A and placebo
- What sample sizes are recommended?

Breast Cancer/Vitamin A Example
- Design a study to have 80% power to detect a 50% relative reduction in risk of breast cancer w/vitamin A (i.e. $RR = p_{VIT\,A} / p_{PLACEBO} = .50$) using a (two-sided) test with significance level α = .05
- To get estimates of proportions of interest:
  - using other studies, the breast cancer rate in the controls can be assumed to be 150/100,000 per year

Breast Cancer/Vitamin A Example
- A 50% relative reduction: if $RR = p_{VIT\,A} / p_{PLACEBO} = .50$ then $p_{VIT\,A} = .50 \times p_{PLACEBO}$
- So, for this desired difference in the relative scale:
  $p_{VIT\,A} = 0.5 \times \frac{150}{100{,}000} = \frac{75}{100{,}000}$
- Notice that this difference on the absolute scale, $p_{VIT\,A} - p_{PLACEBO}$, is much smaller in magnitude:
  $\frac{75}{100{,}000} = 0.00075 = 0.075\%$
Breast Cancer Sample Size Calculation in Stata
- sampsi command
    sampsi .00075 .0015, alpha(.05) power(.8)

Breast Cancer Sample Size Calculation in Stata
- You would need about 34,000 individuals per group
- Why so many?
  - Difference between two hypothesized proportions is very small: .00075
- We would expect about 50 cancer cases among the controls and 25 cancer cases among the vitamin A group
  placebo: $\frac{150}{100{,}000} \times 34{,}000 = 51$
  vitamin A: $\frac{75}{100{,}000} \times 34{,}000 \approx 25$

Breast Cancer/Vitamin A Example
- Suppose you want 80% power to detect only a 20% (relative) reduction in risk associated with vitamin A
- A 20% relative reduction: if $RR = p_{VIT\,A} / p_{PLACEBO} = .80$ then $p_{VIT\,A} = .80 \times p_{PLACEBO}$
- So, for this desired difference in the relative scale:
  $p_{VIT\,A} = 0.8 \times \frac{150}{100{,}000} = \frac{120}{100{,}000}$
- Notice that this difference on the absolute scale, $p_{VIT\,A} - p_{PLACEBO}$, is much smaller in magnitude:
  $\frac{30}{100{,}000} = 0.0003 = 0.03\%$
Breast Cancer Sample Size Calculation in Stata
- sampsi command
    sampsi .0012 .0015, alpha(.05) power(.8)

Breast Cancer Vitamin A Example Revisited
- You would need about 242,000 per group!
- We would expect 360 cancer cases among the placebo group and 290 among vitamin A group

An Alternative Approach: Design a Longer Study
- Proposal
  - Five-year follow-up instead of one year
- Here:
  $p_{VIT\,A} \approx 5 \times .0012 = .006$
  $p_{PLACEBO} \approx 5 \times .0015 = .0075$
- Need about 48,000 per group
  - Yields about 290 cases among vitamin A and 360 cases among placebo
- Issue
  - Loss to follow-up
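The three breast cancer designs (50% reduction over one year, 20% reduction over one year, 20% reduction over five years) can be compared side by side. The sketch below uses a continuity-corrected normal approximation, assumed here to resemble what sampsi reports; the helper name and the scenario labels are hypothetical, and the results should only be read to the nearest thousand.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Continuity-corrected per-group sample size (hypothetical helper)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    diff = abs(p1 - p2)
    n = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
         + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2 / diff**2
    return ceil(n / 4 * (1 + sqrt(1 + 4 / (n * diff))) ** 2)


# (p_vitamin_A, p_placebo) for each design discussed in the slides
scenarios = {
    "50% reduction, 1 year": (0.00075, 0.0015),
    "20% reduction, 1 year": (0.0012, 0.0015),
    "20% reduction, 5 years": (0.006, 0.0075),
}

results = {}
for label, (p_vit_a, p_placebo) in scenarios.items():
    n = n_per_group_two_proportions(p_vit_a, p_placebo)
    results[label] = n
    # Expected number of cancer cases in each arm at this sample size
    print(f"{label}: n per group = {n:,}, "
          f"expected cases: vitamin A {n * p_vit_a:.0f}, placebo {n * p_placebo:.0f}")
```

The required sample size is driven by the absolute difference between the two proportions, so the longer follow-up, which raises both event rates, needs far fewer participants for the same relative reduction.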
Section D
Sample Size and Study Design Principles: A Brief Summary

Designing Your Own Study
- When designing a study, there is a tradeoff between:
  - Power
  - α-level
  - Sample size
  - Minimum detectable difference (specific $H_A$)
- Industry standard: 80% power, α = .05

Designing Your Own Study
- What if sample size calculation yields group sizes that are too big (i.e., cannot afford to do study) or are very difficult to recruit subjects for study?
  - Increase minimum difference of interest
  - Increase α-level
  - Decrease desired power
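The tradeoff can be made concrete by recomputing the required group size for the blood pressure example while relaxing one input at a time. This is a sketch under a normal approximation; the helper name and the relaxed values (difference of 8 mmHg, α = .10, power of 70%) are illustrative assumptions, not recommendations from the lecture.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(diff, sd1, sd2, alpha=0.05, power=0.80):
    """Per-group sample size, equal groups, two-sided test (hypothetical helper)."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    return ceil(z_sum**2 * (sd1**2 + sd2**2) / diff**2)


base = n_per_group_two_means(5.4, 15.3, 18.2)                      # original design
larger_diff = n_per_group_two_means(8.0, 15.3, 18.2)               # bigger minimum difference
larger_alpha = n_per_group_two_means(5.4, 15.3, 18.2, alpha=0.10)  # bigger alpha-level
lower_power = n_per_group_two_means(5.4, 15.3, 18.2, power=0.70)   # lower desired power

print("baseline:", base)
print("minimum difference 8 mmHg:", larger_diff)
print("alpha = .10:", larger_alpha)
print("power = 70%:", lower_power)
```

Each relaxation lowers the required sample size, at the cost of a less sensitive or less stringent study.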
Designing Your Own Study
- Sample size calculations are an important part of study proposal
  - Study funders want to know that the researcher can detect a relationship with a high degree of certainty (should it really exist)
- Even if you anticipate confounding factors, these approaches are the best you can do and are relatively easy
- Accounting for confounders requires more information and sample size has to be done via computer simulation: consult a statistician!

Designing Your Own Study
- When would you calculate the power of a study?
  - Secondary data analysis
  - Data has already been collected, sample size is fixed
  - Pilot study: to illustrate that low power may be a contributing factor to non-significant results and that a larger study may be appropriate

Designing Your Own Study
- What is this specific alternative hypothesis?
  - Power or sample size can only be calculated for a specific alternative hypothesis
  - When comparing two groups this means estimating the true population means (proportions) for each group
Designing Your Own Study
- What is this specific alternative hypothesis?
  - Therefore specifying a difference between the two groups
  - This difference is frequently called minimum detectable difference or effect size, referring to the minimum detectable difference with scientific interest

Designing Your Own Study
- Where does this specific alternative hypothesis come from?
  - Hopefully, not the statistician!
  - As this is generally a quantity of scientific interest, it is best estimated by a knowledgeable researcher or pilot study data
  - This is perhaps the most difficult component of sample size calculations, as there is no magic rule or "industry standard"

FYI: Using Stata to Compute Power
- I promised you in part A of this lecture that I would eventually show you how to compute the power to detect a difference in a study that has already been conducted
- The sampsi command is still the command for this: we just need to feed it slightly different information for it to compute power
Calculating Power
- In order to calculate power for a study comparing two population means, we need the following:
  - Sample size for each group
  - Estimated (population) means, $\mu_1$ and $\mu_2$, for each group: these values frame a specific alternative hypothesis (usually minimum difference of scientific interest)
  - Estimated (population) SDs, $\sigma_1$ and $\sigma_2$
  - α-level of the hypothesis test

Calculating Power
- The Blood Pressure/Oral Contraceptive Example

  Sample Data      n     Mean SBP    SD of SBP
  OC users         8     132.8       15.3
  Non-OC users     21    127.4       18.2

Calculating Power
- Fill in information below with results from this study
  - Sample size for each group ($n_{OC} = 8$, $n_{NO\,OC} = 21$)
  - Estimated (population) means, $\mu_{OC} = 132.8$ and $\mu_{NO\,OC} = 127.4$
  - Estimated (population) SDs, $\sigma_{OC} = 15.3$ and $\sigma_{NO\,OC} = 18.2$, for each group
  - α-level of the hypothesis test (.05)
Calculating Power
- Using sampsi in Stata
  [Stata output not reproduced]

Calculating Power
- In order to calculate power for a study comparing two population proportions, we need:
  - Sample size for each group
  - Estimated (population) proportions, $p_1$ and $p_2$, for each group: these values frame a specific alternative hypothesis (it usually is the minimum difference of scientific interest)
  - α-level of the hypothesis test

Calculating Power
- Ulcer Drug/Healing Example
  - In this study:
    $\hat{p}_{DRUG\,A} = \frac{23}{30} = .77$
    $\hat{p}_{DRUG\,B} = \frac{18}{31} = .58$
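For the ulcer data, the power of roughly 25% quoted earlier in the lecture can be approximated with a continuity-corrected normal calculation. This is a sketch only: the function name `power_two_proportions` is hypothetical, and the particular form of the continuity correction is an assumption that may differ in detail from what sampsi does.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n1, n2, alpha=0.05, continuity=True):
    """Approximate power for comparing two proportions (hypothetical helper)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    diff = abs(p1 - p2)
    if continuity:
        # Shrink the detectable difference by the continuity correction (assumed form)
        diff -= 0.5 * (1 / n1 + 1 / n2)
    p_bar = (n1 * p1 + n2 * p2) / (n1 + n2)                 # pooled proportion under H0
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    se_alt = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return z.cdf((diff - z_crit * se_null) / se_alt)


with_cc = power_two_proportions(0.77, 0.58, 30, 31)
without_cc = power_two_proportions(0.77, 0.58, 30, 31, continuity=False)
print(f"power with continuity correction:    {with_cc:.2f}")
print(f"power without continuity correction: {without_cc:.2f}")
```

The corrected figure lands near the 25% reported for this study, and the uncorrected approximation is somewhat more optimistic.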
Calculating Power
- Fill in information below with results from this study
  - Sample size for each group ($n_{DRUG\,A} = 30$, $n_{DRUG\,B} = 31$)
  - Estimated (population) proportions, $p_{DRUG\,A} = .77$ and $p_{DRUG\,B} = .58$
  - α-level of the hypothesis test (.05)

Calculating Power
- Using the sampsi command
    sampsi p1 p2, n1(n1) n2(n2) alpha(α)

Section E
FYI if Interested
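Example
- Consider the following results from a study done on 29 women, all 35-39 years old

  Sample Data      n     Mean SBP    SD of SBP
  OC users         8     132.8       15.3
  Non-OC users     21    127.4       18.2

Example
- Suppose we want to design a study with 80% power to detect a mean difference of at least 5.4 mmHg between the two groups
- will reject at α = .05 if
  $\frac{\bar{x}_{OC} - \bar{x}_{NO\,OC}}{SE(\bar{x}_{OC} - \bar{x}_{NO\,OC})} \geq 2$
- as such, want
  $\Pr\left( \frac{\bar{x}_{OC} - \bar{x}_{NO\,OC}}{SE(\bar{x}_{OC} - \bar{x}_{NO\,OC})} \geq 2 \right) = 0.80$ if $\mu_{OC} - \mu_{NO\,OC} \geq 5.4$ mmHg
  [Figure: sampling distributions under $H_0: \mu_1 - \mu_2 = 0$ and $H_A: \mu_1 - \mu_2 = d$]

Example
- Consider $SE(\bar{x}_{OC} - \bar{x}_{NO\,OC})$
- Using the estimates from the small study for population SDs:
  $SE(\bar{x}_{OC} - \bar{x}_{NO\,OC}) = \sqrt{\frac{\sigma_{OC}^2}{n_{OC}} + \frac{\sigma_{NO\,OC}^2}{n_{NO\,OC}}}$
- with $n_{OC} = n_{NO\,OC} = n$ this becomes:
  $SE(\bar{x}_{OC} - \bar{x}_{NO\,OC}) = \sqrt{\frac{\sigma_{OC}^2}{n} + \frac{\sigma_{NO\,OC}^2}{n}} = \sqrt{\frac{1}{n}\left(\sigma_{OC}^2 + \sigma_{NO\,OC}^2\right)}$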
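Example
- Suppose we want to design a study with 80% power to detect a mean difference of at least 5.4 mmHg between the two groups, i.e.
  $\Pr\left( \frac{\bar{x}_{OC} - \bar{x}_{NO\,OC}}{\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)}} \geq 2 \right) = 0.80$ if $\mu_{OC} - \mu_{NO\,OC} \geq 5.4$ mmHg
- With some algebra:
  $\Pr\left( \bar{x}_{OC} - \bar{x}_{NO\,OC} \geq 2\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)} \right) = 0.80$ if $\mu_{OC} - \mu_{NO\,OC} \geq 5.4$ mmHg
- But if $\mu_{OC} - \mu_{NO\,OC} \geq 5.4$ mmHg, then assuming large $n$, $\bar{x}_{OC} - \bar{x}_{NO\,OC}$ is a normally distributed process with mean $\mu_{OC} - \mu_{NO\,OC}$ and standard error
  $SE(\bar{x}_{OC} - \bar{x}_{NO\,OC}) = \sqrt{\frac{\sigma_{OC}^2}{n} + \frac{\sigma_{NO\,OC}^2}{n}} = \sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)}$

Example
- Suppose we want to design a study with 80% power to detect a mean difference of at least 5.4 mmHg between the two groups
- So
  $\Pr\left( \bar{x}_{OC} - \bar{x}_{NO\,OC} \geq 2\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)} \right) = 0.80$ if $\mu_{OC} - \mu_{NO\,OC} \geq 5.4$ mmHg
- Becomes:
  $\Pr\left( \frac{(\bar{x}_{OC} - \bar{x}_{NO\,OC}) - (\mu_{OC} - \mu_{NO\,OC})}{\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)}} \geq \frac{2\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)} - (\mu_{OC} - \mu_{NO\,OC})}{\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)}} \right) = 0.8$

Example
- Suppose we want to design a study with 80% power to detect a mean difference of at least 5.4 mmHg between the two groups
- But on a standard normal curve, the value that cuts off 80% of the area to its right is -0.84
- So we need to solve:
  $\frac{2\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)} - (\mu_{OC} - \mu_{NO\,OC})}{\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)}} = -0.84$
- Some more beautiful algebra:
  $2\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)} - (\mu_{OC} - \mu_{NO\,OC}) = -0.84\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)}$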
Example
- Suppose we want to design a study with 80% power to detect a mean difference of at least 5.4 mmHg between the two groups
- Some more beautiful algebra:
  $(2 + 0.84)\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)} = \mu_{OC} - \mu_{NO\,OC}$
- squaring both sides:
  $(2 + 0.84)^2 \cdot \frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2) = (\mu_{OC} - \mu_{NO\,OC})^2$
  $n = \frac{(2 + 0.84)^2 (\sigma_{OC}^2 + \sigma_{NO\,OC}^2)}{(\mu_{OC} - \mu_{NO\,OC})^2}$

Example
- Suppose we want to design a study with 80% power to detect a mean difference of at least 5.4 mmHg between the two groups
- Plugging in our info:
  $n = \frac{(2 + 0.84)^2 (15.3^2 + 18.2^2)}{(5.4)^2} = \frac{8.1 \times 565}{29.2} \approx 157$

Example
- It is also possible to design a study to estimate a quantity, such as a mean difference or difference in proportions, with a desired level of precision
- In other words, the necessary sample sizes can be estimated to try to get a confidence interval of a desired maximum width
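Example
- Suppose we wanted to design a study with equal sample sizes to estimate the mean difference within ± 3 mmHg, i.e. design the study to have a specific precision
- Now using the estimates from the small study for population SDs:
  $SE(\bar{x}_{OC} - \bar{x}_{NO\,OC}) = \sqrt{\frac{\sigma_{OC}^2}{n_{OC}} + \frac{\sigma_{NO\,OC}^2}{n_{NO\,OC}}}$
- with $n_{OC} = n_{NO\,OC} = n$ this becomes:
  $SE(\bar{x}_{OC} - \bar{x}_{NO\,OC}) = \sqrt{\frac{\sigma_{OC}^2}{n} + \frac{\sigma_{NO\,OC}^2}{n}} = \sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)}$

Example
- We want $2 \cdot SE(\bar{x}_{OC} - \bar{x}_{NO\,OC}) = 3$
- plugging in pilot data:
  $2\sqrt{\frac{1}{n}(\sigma_{OC}^2 + \sigma_{NO\,OC}^2)} = 3$
  $2\sqrt{\frac{1}{n}(15.3^2 + 18.2^2)} = 3$

  Sample Data      n     Mean SBP    SD of SBP
  OC users         8     132.8       15.3
  Non-OC users     21    127.4       18.2

Example
- Solving algebraically:
  $2\sqrt{\frac{1}{n}(15.3^2 + 18.2^2)} = 3$
  $\sqrt{\frac{1}{n}(15.3^2 + 18.2^2)} = \frac{3}{2}$
  $\frac{1}{n}(15.3^2 + 18.2^2) = \left(\frac{3}{2}\right)^2$
  $n = \frac{15.3^2 + 18.2^2}{(3/2)^2} \approx 251$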
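Both Section E calculations are easy to check numerically. The sketch below simply evaluates the two closed-form expressions with the pilot SDs; the variable names are hypothetical, and the multipliers 2 and 0.84 are the rounded values used in this section rather than exact normal quantiles.

```python
from math import ceil

sd_oc, sd_no_oc = 15.3, 18.2          # pilot-study SDs
var_sum = sd_oc**2 + sd_no_oc**2      # sigma_OC^2 + sigma_NO_OC^2, about 565

# Power-based design: n = (2 + 0.84)^2 * (sum of variances) / d^2, with d = 5.4
n_power = (2 + 0.84) ** 2 * var_sum / 5.4**2

# Precision-based design: 2 * SE = 3  =>  n = (sum of variances) / (3/2)^2
n_precision = var_sum / (3 / 2) ** 2

print(f"power-based n per group:     {n_power:.1f} (round up to {ceil(n_power)})")
print(f"precision-based n per group: {n_precision:.1f}")
```

The power-based figure is slightly larger than the 153 reported from Stata earlier because this section rounds 1.96 up to 2.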