Study Design: Sample Size Calculation & Power Analysis

1 Study Design: Sample Size Calculation & Power Analysis RCMAR/CHIME/EXPORT April 21, 2008 Honghu Liu, Ph.D.

2 Contents
Background
Common Designs
Examples
Computer Software
Summary & Discussion

3 Background
Power of a statistical test: the probability that the test will yield statistically significant results
Sample size: the minimum sample size required to detect a certain difference between parameters
Sample size and statistical power are linked with the study aims and hypotheses
--Sample data need to be collected to study the hypothesis
--Hypotheses are tested with a certain power

4 Background (con't): Key Concepts
(a) Hypothesis
--Null: $H_0: \mu_1 = \mu_2$
--Alternative: $H_a: \mu_1 \neq \mu_2$ ($\mu_1 > \mu_2$ or $\mu_1 < \mu_2$)
(b) Type I error ($\alpha$)
--Reject a null hypothesis when it is true: $\alpha = \text{Prob}(\text{rejecting } H_0 \mid H_0 \text{ is true})$

5 Key Concepts (con't)
(c) Type II error ($\beta$)
--Accept a null hypothesis when it is false: $\beta = \text{Prob}(\text{accepting } H_0 \mid H_0 \text{ is false})$
(d) Power
--The probability of rejecting a null hypothesis when it is false: $\text{Power} = \Pr(\text{rejecting } H_0 \mid H_0 \text{ is false}) = 1 - \beta$

6 Key Concepts (con't)
(e) Effect size
--The difference between the parameters to be tested (e.g., $\Delta = \mu_1 - \mu_2$)
--Can be expressed in standard deviation units (e.g., $ES = \Delta / \sigma = (\mu_1 - \mu_2) / \sigma$)
(f) Critical value
--The deviate of a distribution that reaches statistical significance under the null hypothesis for a given type I error (e.g., for $\alpha = 0.05$, $z_{1-\alpha} = 1.645$ and $z_{1-\alpha/2} = 1.96$)

7 Key Concepts (con't)
(g) One-sided vs. two-sided test
--One-sided test: a null hypothesis can only be rejected in one direction (directional test), e.g., reject if $z > z_{1-\alpha}$
--Two-sided test: a null hypothesis can be rejected in either direction, e.g., reject if $|z| > z_{1-\alpha/2}$

8 Key Concepts (con't)
(h) Acceptance region & rejection region
--Acceptance region: the null hypothesis will be accepted for all values that fall into this region (e.g., $-z_{1-\alpha/2} \le z \le z_{1-\alpha/2}$)
--Rejection region: the null hypothesis will be rejected for all values that fall into this region (e.g., $z < -z_{1-\alpha/2}$ or $z > z_{1-\alpha/2}$)
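The critical values and regions above can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `critical_value` and `in_rejection_region` are illustrative and do not come from the slides.

```python
from statistics import NormalDist


def critical_value(alpha: float, two_sided: bool = True) -> float:
    """Standard normal deviate z_{1-alpha/2} (two-sided) or z_{1-alpha} (one-sided)."""
    tail = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail)


def in_rejection_region(z: float, alpha: float = 0.05, two_sided: bool = True) -> bool:
    """True when the test statistic z falls in the rejection region."""
    crit = critical_value(alpha, two_sided)
    # Two-sided: reject in either tail; one-sided: reject in the upper tail only.
    return abs(z) > crit if two_sided else z > crit


print(round(critical_value(0.05, two_sided=False), 3))  # 1.645
print(round(critical_value(0.05), 2))                   # 1.96
```

A statistic of z = 1.7 illustrates the difference between the two tests: it falls in the rejection region of the one-sided test at α = 0.05 but in the acceptance region of the two-sided test.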

9 [Figure: One-sided test of a normal distribution with Type I error 0.05, showing power, acceptance region, and rejection region]

10 Five Key Factors
1. Sample size
2. Effect size
3. Significance level
4. Power of the test
5. Variability
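How the five factors interact can be seen by fixing four of them and computing the fifth. The sketch below computes power from the other four for a two-sided one-sample z-test. That choice of test is an assumption made for illustration, and `z_test_power` is a hypothetical helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(delta: float, sd: float, n: int, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z-test.

    delta: difference to detect (effect size in raw units)
    sd:    standard deviation (variability)
    n:     sample size
    alpha: significance level
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = abs(delta) * sqrt(n) / sd  # standardized distance from the null
    # Probability of landing in either tail of the rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


for n in (50, 100, 200):
    print(n, round(z_test_power(delta=5, sd=22, n=n), 3))
```

Power rises with sample size, effect size, and significance level, and falls as variability grows. With no true difference (`delta=0`) the function returns the significance level itself.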

11 Common Designs
I. Continuous measure
a) One sample normal
$H_0: \mu = \mu_0$; $H_a: \mu = \mu_1$
$$n = \left( \frac{z_{1-\beta} + z_{1-\alpha/2}}{\Delta / s} \right)^2$$
where $\Delta = \mu_1 - \mu_0$ is the difference to be detected; $s$ is the standard deviation; $z_{1-\beta}$ and $z_{1-\alpha/2}$ are the normal deviates for the desired power and significance level.
Note: this is also the sample size for the case of paired observations.
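The one-sample formula translates directly into code. This is a minimal sketch; the function name and the choice to round up to the next whole subject are assumptions, not part of the slide.

```python
from math import ceil
from statistics import NormalDist


def one_sample_normal_n(delta: float, sd: float,
                        alpha: float = 0.05, power: float = 0.80) -> int:
    """n = ((z_{1-beta} + z_{1-alpha/2}) / (delta / sd))^2, rounded up."""
    std_normal = NormalDist()
    z_power = std_normal.inv_cdf(power)          # z_{1-beta}
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}
    return ceil(((z_power + z_alpha) / (delta / sd)) ** 2)


# Detect a difference of half a standard deviation with 80% power.
print(one_sample_normal_n(delta=0.5, sd=1.0))  # 32
```

Because the sample size scales with the square of `sd / delta`, halving the detectable difference roughly quadruples the required n.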

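12 Common Designs (con't)
b) Two sample normal
$H_0: \mu_1 = \mu_2$; $H_a: \mu_1 \neq \mu_2$
$$n_1 = \left( \frac{z_{1-\beta} + z_{1-\alpha/2}}{\Delta / s} \right)^2 \left( 1 + \frac{1}{r} \right), \qquad n_2 = r \cdot n_1 \text{ with } 0 < r \le 1$$
where $\Delta = \mu_1 - \mu_2$ is the difference to be detected; $s$ is the common standard deviation; $z_{1-\beta}$ and $z_{1-\alpha/2}$ are the normal deviates for the desired power and significance level. With equal group sizes ($r = 1$) the factor $(1 + 1/r)$ equals 2.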
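A sketch of the two-sample calculation follows. It assumes the allocation convention n2 = r * n1 with the factor (1 + 1/r), which is the standard form for unequal groups; the function name and the rounding up of both group sizes are illustrative choices.

```python
from math import ceil
from statistics import NormalDist


def two_sample_normal_n(delta: float, sd: float, alpha: float = 0.05,
                        power: float = 0.80, r: float = 1.0) -> tuple:
    """Return (n1, n2) for a two-group comparison of means, with n2 = r * n1."""
    if not 0 < r <= 1:
        raise ValueError("r must satisfy 0 < r <= 1")
    std_normal = NormalDist()
    z_sum = std_normal.inv_cdf(power) + std_normal.inv_cdf(1 - alpha / 2)
    n1 = ceil((z_sum / (delta / sd)) ** 2 * (1 + 1 / r))
    n2 = ceil(r * n1)
    return n1, n2


print(two_sample_normal_n(delta=0.5, sd=1.0))         # (63, 63)
print(two_sample_normal_n(delta=0.5, sd=1.0, r=0.5))  # (95, 48)
```

The unbalanced design needs more subjects in total (143 versus 126 here), which is why equal allocation is usually preferred when nothing constrains one arm.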

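13 Common Designs (con't)
c) Two group repeated measures (time-averaged means) (Diggle, et al, 1994)
$H_0: \mu_1 = \mu_2$; $H_a: \mu_1 \neq \mu_2$
$$m = \frac{2 (z_\alpha + z_\beta)^2 \sigma^2 \{1 + (n-1)\rho\}}{n d^2} = \frac{2 (z_\alpha + z_\beta)^2 \{1 + (n-1)\rho\}}{n (d/\sigma)^2}$$
where $m$ is the number of subjects per group and $n$ is the number of repeated measurements per subject; $z_\alpha$ and $z_\beta$ are the normal deviates; $\sigma^2$ is the common variance; $\rho = \text{Corr}(y_{ij}, y_{ik})$ is the intra-patient correlation; $d$ is the difference between the average responses of the two groups.
Note: for unbalanced designs, see Liu, et al, Journal of Modern Applied Statistics; PASS 2008, NCSS.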
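The time-averaged formula can be sketched as below. The slide does not say whether z_α is a one-sided or two-sided deviate, so the `two_sided` flag is an assumption, defaulting to two-sided for consistency with the other designs. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def time_averaged_means_m(d: float, sigma: float, n: int, rho: float,
                          alpha: float = 0.05, power: float = 0.80,
                          two_sided: bool = True) -> int:
    """Subjects per group: m = 2 (z_a + z_b)^2 sigma^2 {1 + (n-1) rho} / (n d^2)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = std_normal.inv_cdf(power)
    inflation = 1 + (n - 1) * rho  # effect of intra-patient correlation
    return ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 * inflation / (n * d ** 2))


print(time_averaged_means_m(d=0.5, sigma=1.0, n=1, rho=0.5))  # 63
print(time_averaged_means_m(d=0.5, sigma=1.0, n=3, rho=0.5))  # 42
```

With a single measurement per subject (n = 1) the result matches the ordinary two-sample calculation. Adding repeated measurements lowers m, but the gain shrinks as ρ approaches 1, since highly correlated measurements carry little new information.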

14 Common Designs (con't)
II. Binomial distribution
a) One sample binomial
$H_0: p = p_0$; $H_a: p = p_1$
$$n = \left[ \frac{z_{1-\beta} + z_{1-\alpha/2} \sqrt{\dfrac{p_0 (1 - p_0)}{p_1 (1 - p_1)}}}{p_1 - p_0} \right]^2 p_1 (1 - p_1)$$
where $p_0$ is the null value of the probability; $p_1$ is the alternative value of the probability; $z_{1-\beta}$ and $z_{1-\alpha/2}$ are the normal deviates for the desired power and significance level.
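A minimal sketch of the one-sample binomial calculation, written in the same factored form as the formula above. The function name and the rounding up are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def one_sample_binomial_n(p0: float, p1: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Sample size to test H0: p = p0 against the alternative p = p1."""
    std_normal = NormalDist()
    z_power = std_normal.inv_cdf(power)          # z_{1-beta}
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # z_{1-alpha/2}
    var0 = p0 * (1 - p0)  # binomial variance under the null
    var1 = p1 * (1 - p1)  # binomial variance under the alternative
    ratio = (z_power + z_alpha * sqrt(var0 / var1)) / (p1 - p0)
    return ceil(ratio ** 2 * var1)


print(one_sample_binomial_n(p0=0.5, p1=0.6))  # 194
```

Unlike the normal case, the variance differs under the null and the alternative, which is why both p0(1 - p0) and p1(1 - p1) appear.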

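15 Common Designs (con't)
b) Two sample binomial
$H_0: p_1 = p_2$; $H_a: p_1 \neq p_2$
$$n_1 = \left\{ \frac{z_{1-\beta} + z_{1-\alpha/2} \sqrt{\dfrac{\bar{p} (1 - \bar{p}) (1/r + 1)}{p_1 (1 - p_1) + p_2 (1 - p_2)/r}}}{p_1 - p_2} \right\}^2 \left[ p_1 (1 - p_1) + \frac{p_2 (1 - p_2)}{r} \right]$$
$n_2 = r \cdot n_1$ with $0 < r \le 1$, and $\bar{p} = (p_1 + p_2)/2$
where $p_1$ and $p_2$ are the probabilities in groups 1 and 2; $z_{1-\beta}$ and $z_{1-\alpha/2}$ are the normal deviates for the desired power and significance level.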
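The two-sample binomial calculation can be sketched as follows. The placement of the 1/r term on group 2 follows from assuming n2 = r * n1, and the unweighted average for the pooled proportion is taken from the slide; both are labeled as assumptions in the code. The function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def two_sample_binomial_n(p1: float, p2: float, alpha: float = 0.05,
                          power: float = 0.80, r: float = 1.0) -> tuple:
    """Return (n1, n2) for comparing two proportions, assuming n2 = r * n1."""
    if not 0 < r <= 1:
        raise ValueError("r must satisfy 0 < r <= 1")
    std_normal = NormalDist()
    z_power = std_normal.inv_cdf(power)
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2  # unweighted average, as on the slide
    var_null = p_bar * (1 - p_bar) * (1 / r + 1)
    var_alt = p1 * (1 - p1) + p2 * (1 - p2) / r  # assumes n2 = r * n1
    ratio = (z_power + z_alpha * sqrt(var_null / var_alt)) / (p1 - p2)
    n1 = ceil(ratio ** 2 * var_alt)
    return n1, ceil(r * n1)


print(two_sample_binomial_n(p1=0.60, p2=0.75))  # (152, 152)
```

Under this normal approximation, roughly 152 subjects per arm give 80% power to distinguish 60% from 75%. Dedicated software may report somewhat different numbers if it applies a continuity correction or an exact method.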

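16 Common Designs (con't)
c) Two sample binomial repeated measures (time-averaged proportions) (Diggle, et al. 1994)
$H_0: p_1 = p_2$; $H_a: p_1 \neq p_2$
$$m = \frac{\left[ z_\alpha \{2 \bar{p} \bar{q} (1 + (n-1)\rho)\}^{1/2} + z_\beta \{(1 + (n-1)\rho)(p_1 q_1 + p_2 q_2)\}^{1/2} \right]^2}{n d^2}$$
where $\bar{p} = (p_1 + p_2)/2$ and $\bar{q} = 1 - \bar{p}$, with $q_1 = 1 - p_1$ and $q_2 = 1 - p_2$; $\rho = \text{Corr}(y_{ij}, y_{ik})$ is the intra-patient correlation; $d$ is the difference between the average responses of the two groups.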
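A sketch of the time-averaged proportions formula follows. As with the time-averaged means, the one-sided versus two-sided convention for z_α is not stated on the slide, so the `two_sided` flag is an assumption, and the function name is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def time_averaged_proportions_m(p1: float, p2: float, n: int, rho: float,
                                alpha: float = 0.05, power: float = 0.80,
                                two_sided: bool = True) -> int:
    """Subjects per group for comparing time-averaged proportions."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = std_normal.inv_cdf(power)
    p_bar = (p1 + p2) / 2
    q_bar = 1 - p_bar
    inflation = 1 + (n - 1) * rho  # intra-patient correlation term
    d = p1 - p2                    # difference in average response
    null_term = z_alpha * sqrt(2 * p_bar * q_bar * inflation)
    alt_term = z_beta * sqrt(inflation * (p1 * (1 - p1) + p2 * (1 - p2)))
    return ceil((null_term + alt_term) ** 2 / (n * d ** 2))


print(time_averaged_proportions_m(0.60, 0.75, n=1, rho=0.5))  # 152
print(time_averaged_proportions_m(0.60, 0.75, n=3, rho=0.5))  # 102
```

With one measurement per subject the result reduces to the balanced two-sample binomial calculation.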

17 III. Other Designs
Matched case-control: McNemar's test
Analysis of variance (ANOVA)
Correlation coefficient
Logistic regression
Multiple regression

18 III. Other Designs (con't)
Survey design methods
--Stratification
--Clustering
--Complex survey (stratification and clustering)
Estimate the design effect: $\text{deff} = \text{var}(\text{survey design}) / \text{var}(\text{srs})$, where srs denotes simple random sampling
Sample size = deff × (traditional sample size calculation formula)
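The design-effect adjustment is a simple multiplication, sketched below. The variance values and the starting sample size in the example are made-up inputs for illustration, and the function names are hypothetical.

```python
from math import ceil


def design_effect(var_design: float, var_srs: float) -> float:
    """deff = variance under the survey design / variance under simple random sampling."""
    return var_design / var_srs


def adjusted_sample_size(n_srs: int, deff: float) -> int:
    """Inflate a traditional (simple random sampling) sample size by the design effect."""
    return ceil(deff * n_srs)


deff = design_effect(var_design=3.0, var_srs=2.0)  # illustrative variances
print(deff)                             # 1.5
print(adjusted_sample_size(348, deff))  # 522
```

A design effect above 1, typical of clustered samples, means more subjects are needed than a simple random sample would require. Efficient stratification can push it below 1.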

19 Examples: Study A
Study aim: To study the impact of a new physical therapy on the quality of life of patients with chronic back pain
Hypothesis: The new physical therapy can significantly improve the quality of life of patients with chronic back pain
Step 1 Outcome measure: SF-12 physical component summary (PCS) score
Step 2 Design: two-group comparison between treatment and control
Step 3 Statistical model: two-group comparison with a continuous outcome measure

20 Study A (con't)
Step 4 Obtain the required statistics for the statistical model:
1) Type I error:
2) Type II error: 0.15 (power 85%)
3) Mean of PCS: 44
4) SD of PCS: 22
5) Effect size: 5 (minimal clinically meaningful difference)
Step 5 Find a sample size calculation software package, plug in the statistics, and get the result: n = 348 for each arm
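The Study A result can be checked with the two-sample normal formula for equal arms. The Type I error value is not legible in the source, so the sketch assumes a two-sided α of 0.05; under that assumption the formula gives the slide's n = 348 per arm.

```python
from math import ceil
from statistics import NormalDist

# Study A inputs from the slide; ALPHA is an assumption (not legible in the source).
ALPHA = 0.05   # assumed two-sided Type I error
POWER = 0.85   # Type II error 0.15
SD = 22.0      # SD of PCS
DELTA = 5.0    # minimal clinically meaningful difference

std_normal = NormalDist()
z_sum = std_normal.inv_cdf(POWER) + std_normal.inv_cdf(1 - ALPHA / 2)

# Equal arms: n per arm = 2 * ((z_{1-beta} + z_{1-alpha/2}) / (delta / sd))^2
n_per_arm = ceil(2 * (z_sum / (DELTA / SD)) ** 2)
print(n_per_arm)  # 348
```

The PCS mean of 44 does not enter the calculation: for a comparison of means only the difference and the standard deviation matter.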

21 Study B
Study aim: To study the difference in the rate of participation in novel clinical trials of gene therapy/stem cell research between the Innovative Health Research Intervention (IHRI) and the Standard HIV Attention Control (AC)
Hypothesis: IHRI has a higher participation rate than AC
Step 1 Outcome measure: willingness to participate (binary yes/no variable)
Step 2 Design: randomized two-group comparison between IHRI and AC arms
Step 3 Statistical model: two-group comparison with binomial distribution

22 Study B (con't)
Step 4 Obtain the required statistics for the statistical model:
1) Type I error:
2) Type II error: 0.20 (power 80%)
3) Estimated rate of participation: 60% for AC
4) Sample size capacity: 180 in each arm
5) Effect size: to be determined
Step 5 Find a sample size and power analysis software package, plug in the statistics, and get the detectable effect size: 15%.
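Study B works backward from a fixed sample size. One way to sanity-check the reported 15% is to compute the approximate power for 60% versus 75% with 180 per arm, by inverting the balanced two-sample binomial formula. The sketch assumes a two-sided α of 0.05 (the value is not legible in the source) and a plain normal approximation; the software used for the slide may have applied a different method, so exact agreement is not expected.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p1: float, p2: float, n_per_arm: int,
                         alpha: float = 0.05) -> float:
    """Approximate power for comparing two proportions with equal arms."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    p_bar = (p1 + p2) / 2
    sd_null = sqrt(2 * p_bar * (1 - p_bar))          # under H0: p1 = p2
    sd_alt = sqrt(p1 * (1 - p1) + p2 * (1 - p2))     # under the alternative
    z = (abs(p1 - p2) * sqrt(n_per_arm) - z_crit * sd_null) / sd_alt
    return std_normal.cdf(z)


power = two_proportion_power(p1=0.60, p2=0.75, n_per_arm=180)
print(f"approximate power: {power:.2f}")
```

Under these assumptions the power comes out above the 80% target, which is consistent with a 15-point difference being detectable at 180 per arm.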

23 Sample size calculation & power analysis software
General-purpose statistical software (e.g., STATA, SPSS, SAS, GLIM, Sigmastat and XLISP_STAT)
Special-purpose statistical software (e.g., EpiInfo)
Stand-alone sample size & power analysis software (e.g., NCSS-PASS, nquery and SYSTAT Design)
Stand-alone sample size & power analysis software for specialized applications (PRESISION for survival studies)
Software on the Internet (e.g., ucla.edu/)

24 Summary & Discussion
Related key factors
Min-Max rule
--Minimum required sample size for each main hypothesis
--Maximum sample size among the multiple minimums
Practical factors that influence sample size determination
--Budget/sample limitation
--Backward estimation
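The Min-Max rule amounts to taking the largest of the per-hypothesis minimums. A minimal sketch follows; the hypothesis labels and sample sizes are hypothetical inputs, not results from the slides.

```python
def min_max_sample_size(minimum_n_by_hypothesis: dict) -> int:
    """Study-wide sample size: the maximum of the minimum n for each main hypothesis."""
    if not minimum_n_by_hypothesis:
        raise ValueError("at least one hypothesis is required")
    return max(minimum_n_by_hypothesis.values())


# Hypothetical minimum per-arm sample sizes for three main hypotheses.
minimums = {
    "primary outcome": 348,
    "secondary outcome": 152,
    "subgroup comparison": 410,
}
print(min_max_sample_size(minimums))  # 410
```

Sizing the study for the most demanding hypothesis means every other main hypothesis is tested with at least its required power.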

25 Summary & Discussion (con't)
Find the necessary and right statistics (e.g., mean, SD & ES)
Get multiple solutions and select the best design

26 References
Jacob Cohen (1988). Statistical Power Analysis for the Behavioral Sciences. Lawrence Erlbaum Associates, Publishers. Hillsdale, New Jersey.
Diggle, PJ, Liang, KY and Zeger, SL (1996). Analysis of Longitudinal Data. Oxford University Press Inc., New York.
R. Barker Bausell and Yu-Fang Li (2002). Power Analysis for Experimental Research. Cambridge University Press.
PASS 2008. Power Analysis and Sample Size for Windows. NCSS, Kaysville, Utah.
Liu HH and Wu TT. Sample size calculation and power analysis for time-averaged difference. Journal of Modern Applied Statistical Methods. 2005;4(2):
Helena Chmura Kraemer and Sue Thiemann (1987). How Many Subjects? Sage Publications, London.
Liu HH & Wu TT. Sample Size Calculation and Power Analysis of Changes in Mean Response Over Time. Journal of communication in Statistics (in press)
Sharon L. Lohr (1999). Sampling: Design and Analysis. Duxbury Press.

27 Questions? mednet.ucla.edu
