Sample size and power calculation using R and SAS proc power. Ho Kim GSPH, SNU


2 P-value (1) We want to show that the means of two populations are different. $\bar{Y}_1$: a sample mean from the 1st population. $\bar{Y}_2$: a sample mean from the 2nd population. If the two population means are the same, the two sample means will be very similar. If we calculate the difference of the two sample means repeatedly, the differences pile up around 0 and large differences are rare.

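3 P-value (2) [Figure: distribution of the difference of the two sample means, centered at 0; the tails are labeled "Events with very low probability".]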

4 P-value (3) P-value = P(observing our data or more extreme events | the two population means are the same). Small p-value: we have observed an event that happens with very low probability, so something is wrong with the null hypothesis. We reject the null hypothesis and conclude that the two means are different.

5 A (observed data) → B (research hypothesis) is equivalent to not-B → not-A. Null hypothesis (not-B): no difference (Ho). Alternative hypothesis (B): different means (Ha). Type I error = Pr(reject Ho | Ho is true). Type II error = Pr(not reject Ho | Ha is true). Power = 1 − Type II error (the probability of finding the difference when the difference is real).

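6 Example $Y \sim \mathrm{Bin}(100, p)$, Ho: $p = 1/2$ vs. Ha: $p < 1/2$. Write $\hat{p} = Y/n$ for the sample proportion. Fact: if Ho is true, $\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \sim N(0,1)$ approximately. Rejection region: $\frac{\hat{p} - 0.5}{\sqrt{0.5(1-0.5)/n}} < -1.645$. Type I error = Pr(Reject Ho | Ho is true) $= \Pr\left(\frac{\hat{p} - 0.5}{\sqrt{0.5(1-0.5)/n}} < -1.645\right) = \Pr(Z < -1.645) = 0.05$.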

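7 Example If we observe $\hat{p} = Y/n = 45/100$, the test statistic is $\frac{0.45 - 0.5}{\sqrt{0.5(1-0.5)/100}} = -1$. We cannot reject Ho since the test statistic is not located inside the rejection region (RR). P-value = probability of observing a difference as large as ours or larger if Ho is true $= \Pr(Z < -1)$.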

8 Example $\Pr(Z < -1) \approx 0.159$
> # R code
> pnorm(-1.645)
[1] 0.04998491
> pnorm(-1)
[1] 0.1586553
* SAS code;
data one; one=probnorm(-1.645); two=probnorm(-1); run;
proc print; run;

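9 Example To calculate power: with $\hat{p} = Y/n$, we have $\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \sim N(0,1)$, where now $p$ is not 1/2. Power = Pr(Reject Ho | Ha is true) $= \Pr\left(\frac{\hat{p} - 0.5}{\sqrt{0.5(1-0.5)/n}} < -1.645\right) = {?}$ To calculate power, we need to know the exact value of p under Ha.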

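10 Example A precise problem: calculate the power when $p = 1/3$ (Ha). With $\hat{p} = Y/n$, under Ha $\frac{\hat{p} - \frac{1}{3}}{\sqrt{\frac{1}{3}\left(1 - \frac{1}{3}\right)/n}} \sim N(0,1)$. Power = Pr(Reject Ho | Ha is true) $= \Pr\left(\frac{\hat{p} - 0.5}{\sqrt{0.5(1-0.5)/n}} < -1.645 \,\middle|\, p = 1/3\right)$.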

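11 Example Power = Pr(Reject Ho | Ha is true), with $\hat{p} = Y/n$, $n = 100$ and $p = 1/3$:
$= \Pr\left(\frac{\hat{p} - 0.5}{\sqrt{0.5(1-0.5)/100}} < -1.645\right)$
$= \Pr(\hat{p} < 0.418)$
$= \Pr\left(\hat{p} - \tfrac{1}{3} < 0.418 - \tfrac{1}{3}\right)$
$= \Pr\left(\frac{\hat{p} - 1/3}{\sqrt{(1/3)(2/3)/100}} < \frac{0.418 - 1/3}{\sqrt{(1/3)(2/3)/100}}\right)$
$= \Pr(Z < 1.796) \approx 0.964$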
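The calculation on this slide can be re-run in R. The sketch below is only an illustration: the variable names are made up here, and the exact binomial line is an extra cross-check that is not part of the slides. Because it uses the unrounded quantile from qnorm(0.05) instead of the rounded cutoff 0.418, the standardized value comes out near 1.79 rather than exactly 1.796.

```r
# Power of the one-sided test H0: p = 1/2 vs Ha: p < 1/2 when the true p is 1/3.
n      <- 100
p_null <- 0.5
p_alt  <- 1 / 3
z_crit <- qnorm(0.05)                       # about -1.645

# Rejection cutoff on the proportion scale: reject when p_hat < cutoff
cutoff <- p_null + z_crit * sqrt(p_null * (1 - p_null) / n)   # about 0.418

# Standardize the cutoff under Ha and take the lower-tail probability
z_alt        <- (cutoff - p_alt) / sqrt(p_alt * (1 - p_alt) / n)
power_normal <- pnorm(z_alt)

# Cross-check (assumption: not on the slides): exact binomial probability
# of the same event, Y / n < cutoff, i.e. Y <= ceiling(cutoff * n) - 1
power_exact <- pbinom(ceiling(cutoff * n) - 1, n, p_alt)

print(c(cutoff = cutoff, z = z_alt, normal = power_normal, exact = power_exact))
```

The normal approximation and the exact binomial probability should agree to about two decimal places for this n.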

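12 Example In general, Power $= \Pr\left(Z < \frac{0.418 - p_0}{\sqrt{p_0(1-p_0)/100}}\right)$, where $p_0$ is the true proportion under Ha.
> p0 = (0:50)/100
> power = pnorm((0.418 - p0)/sqrt(p0*(1 - p0)/100))
> plot(p0, power)
data one;
  do p0 = 0.01 to 0.5 by 0.01;  /* start above 0 to avoid dividing by zero */
    power = probnorm((0.418 - p0)/sqrt(p0*(1 - p0)/100));
    output;
  end;
run;
proc gplot; plot power*p0; run;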

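13 Interpretation Ho: $p = 1/2$ vs. Ha: $p < 1/2$. Suppose our rule is to reject Ho when $\frac{\hat{p} - 0.5}{\sqrt{0.5(1-0.5)/n}} < -1.645$, with $\hat{p} = Y/n$ and $n = 100$. If p = 0.2, the probability of rejecting Ho (the power) is almost 1. If p = 0.4, the power is only about 0.64, and at p ≈ 0.418 it is exactly 0.5 (half and half). For p closer to 1/2, the test cannot find the difference very well.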
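To read numbers off the power curve instead of the plot, the formula from the previous slide can be wrapped in a small function. This is a sketch: the name power_at and the grid of p values are choices made here, not part of the slides.

```r
# Approximate power of the rule "reject H0 when p_hat < 0.418" with n = 100,
# evaluated at a few true values of p.
power_at <- function(p, cutoff = 0.418, n = 100) {
  pnorm((cutoff - p) / sqrt(p * (1 - p) / n))
}

p_values <- c(0.2, 0.3, 0.4, 0.418, 0.45, 0.5)
result   <- data.frame(p = p_values, power = round(power_at(p_values), 3))
print(result)
```

At p = 0.5 the function returns roughly the type I error of the test, which is a quick sanity check on the cutoff.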

14 Likelihood and maximum likelihood estimator Likelihood = Pr(getting our data | given model). E.g., we observe 30 patients out of 100 people. Model: Y = number of patients out of 100 people, $Y \sim \mathrm{Binomial}(n = 100, p)$.

15 Likelihood and maximum likelihood estimator pdf = Pr(Y = y | given model) $= \binom{100}{y} p^{y} (1-p)^{100-y}$. Maximum likelihood estimator (MLE) = the value of p which maximizes the likelihood. We can show that MLE = 30/100 = 0.3. How?

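16 Likelihood and maximum likelihood estimator $l = \log(\text{likelihood}) = \text{const} + y \log p + (100 - y)\log(1 - p)$; $\frac{\partial l}{\partial p} = 0 \Rightarrow \hat{p} = y/100$. Or, graphically:
> curve(30*log(x) + (100 - 30)*log(1 - x), 0.1, 0.9)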

18 > f = function(x) 30*log(x) + (100 - 30)*log(1 - x)
> optimize(f, c(0.1, 0.9), maximum = TRUE)
The call returns $maximum (about 0.3) and $objective (about -61.09).
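The numerical maximum can be checked against the closed form. The steps below spell out the derivative argument for the binomial log-likelihood with n = 100, using the symbols l, y, and p from the slides; the second-derivative check is an added detail.

```latex
\begin{align*}
l(p) &= \log\binom{100}{y} + y\log p + (100 - y)\log(1 - p) \\
\frac{\partial l}{\partial p} &= \frac{y}{p} - \frac{100 - y}{1 - p} = 0
  \;\Longrightarrow\; y(1 - p) = (100 - y)\,p
  \;\Longrightarrow\; \hat{p} = \frac{y}{100} \\
\frac{\partial^2 l}{\partial p^2} &= -\frac{y}{p^2} - \frac{100 - y}{(1 - p)^2} < 0
  \quad\text{(so } \hat{p} \text{ is a maximum)}
\end{align*}
```

With y = 30 this gives $\hat{p} = 0.3$, in agreement with the value found by optimize.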

19 Homework 1 Let's assume that the number of people with a certain genotype follows a binomial distribution with parameter p. We observe 900 people with the genotype out of 10,000. Derive the likelihood and the MLE of p.

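20 Multiple Comparisons Suppose the size of the tests of $H_{01}$ and $H_{02}$ is $\alpha = 0.05$ each, so $\Pr(\text{do not reject } H_{0i} \mid H_{0i} \text{ is true}) = 1 - \alpha$ for $i = 1, 2$. Let $H_0$: $H_{01}$ and $H_{02}$. If the two tests are independent, $\Pr(\text{do not reject } H_{01} \text{ and do not reject } H_{02} \mid H_0) = (1-\alpha)(1-\alpha) = (.95)(.95) = 0.9025$. The overall type I error for $H_0$ is therefore $1 - 0.9025 = 0.0975$, not 0.05, so the type I error has inflated. In general, if $H_0 = H_{01}$ and $\dots$ and $H_{0k}$, then $\Pr(\text{do not reject } H_0 \mid H_0) = (1-\alpha)^k$.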

21 Multiple Comparisons Bonferroni correction: if we have m comparisons, set the individual significance level to $\alpha/m$; the overall type I error, $1 - (1 - \alpha/m)^m$, is then close to $\alpha$. Ex) If we have 10 comparisons and $\alpha = 0.05$, the threshold for each individual p-value is $0.05/10 = 0.005$. This is called the Bonferroni-corrected p-value threshold.
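The inflation of the overall type I error, and the effect of the Bonferroni correction, can be tabulated in a few lines of R. This is a sketch that assumes independent tests and $\alpha = 0.05$; the variable names and the chosen values of m are illustrative.

```r
# Overall (familywise) type I error for m independent tests, each at level alpha,
# with and without the Bonferroni correction alpha / m.
alpha <- 0.05
m     <- c(1, 2, 5, 10, 20)

fwer_uncorrected <- 1 - (1 - alpha)^m        # each test at level alpha
alpha_bonf       <- alpha / m                # Bonferroni per-test level
fwer_bonferroni  <- 1 - (1 - alpha_bonf)^m   # stays at or just below alpha

print(data.frame(m, fwer_uncorrected, alpha_bonf, fwer_bonferroni))
```

The row for m = 2 reproduces the 0.0975 figure from the two-hypothesis example, and the last column shows why the corrected overall error is described as "close to" $\alpha$ rather than equal to it.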

22 Terminology (Piantadosi) Phase I trial: a clinical trial designed to measure the distribution, metabolism, excretion, and toxicity of a new drug. DF (dose-finding). Phase II trial: a clinical trial designed to test the feasibility of, and level of activity of, a new agent or procedure. SE (safety and efficacy).

23 Terminology Phase III trial: a clinical trial designed to estimate the relative efficacy of a treatment against a standard, alternative, or placebo. CTE (comparative treatment efficacy). Phase IV trial: a surveillance clinical trial designed to estimate the frequency of uncommon side effects, toxicity, or interactions. ES (expanded safety).

24 Terminology Quantile. [Figure: probability density function of $N(\mu, \sigma^2)$ with the quantiles $Z$ marked on the axis.]

25 [Table: $Z_\alpha$ by type 1 error ($\alpha$) for one-tailed and two-tailed tests, and $Z_\beta$ by type 2 error ($\beta$) and power.]

26 Get Motivated
        Trt A   Trt B   Total
+       n11     n12     n1+
-       n21     n22     n2+
Total   n+1     n+2     n++
[Formulas: test statistic for the 2 x 2 table (involving $v_{11}$) and the resulting p-value.]

27 [Example: the same table with each $n_{ij}$ replaced by $100\,n_{ij}$, with the p-value compared against 0.01.] The difference of the two proportions is exactly the same, yet the statistical significance is very different. The p-value of a statistical test depends heavily on the sample size: p-values are very small for studies with very large sample sizes. We need a balance between the sample size (cost) and the statistical significance (efficacy of the experiment). We want to show something with minimum cost.
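The point about sample size can be seen with a toy example. The counts below are hypothetical (they are not the numbers from the slides): 11 of 20 responders under Trt A and 9 of 20 under Trt B, and then the same table with every count multiplied by 100. The helper name p_value_for is also made up for this sketch.

```r
# Same difference in proportions (0.55 vs 0.45), two very different sample sizes.
p_value_for <- function(scale) {
  x <- c(11, 9) * scale     # hypothetical responders in Trt A and Trt B
  n <- c(20, 20) * scale    # hypothetical patients per arm
  prop.test(x, n, correct = FALSE)$p.value   # chi-square test without continuity correction
}

p_small <- p_value_for(1)     # 20 patients per arm
p_large <- p_value_for(100)   # 2000 patients per arm
print(c(p_small = p_small, p_large = p_large))
```

The estimated difference is 0.10 in both cases; only the amount of data behind it changes, and with it the p-value.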

28 Sample size calculations
Sampling survey: purpose: estimation; tool: sampling error; example: opinion survey.
Clinical trials: purpose: testing; tool: Type I and II error; example: clinical trial.

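29 Simple random sampling. N: population size, n: sample size. $\hat{\mu} = \bar{y} = \sum_{i=1}^{n} y_i / n$, $\mathrm{Var}(\bar{y}) = \frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}$. Bound on the error of estimation B: $1.96\sqrt{\mathrm{Var}(\bar{y})} \approx 2\sqrt{\frac{\sigma^2}{n} \cdot \frac{N - n}{N - 1}} = B$, which gives $n = \frac{N\sigma^2}{(N-1)D + \sigma^2}$ with $D = B^2/4$. This corresponds to a 95% confidence interval (CI); $\sqrt{\mathrm{Var}(\bar{y})}$ is the standard error (SE).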

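30 If $y_i = 0$ or 1, $\bar{y}$ is a proportion and $n = \frac{Npq}{(N-1)D + pq}$ with $D = B^2/4$ and $q = 1 - p$. Ex1) N = 2000, 95% confidence level, B = 0.05, n = ? If there is no prior information, use p = q = 0.5. Then $D = B^2/4 = 0.05^2/4 = 0.000625$ and $n = \frac{2000 \times 0.25}{1999 \times 0.000625 + 0.25} \approx 333.47$. At least 334 people.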
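Ex1 can be reproduced with a short R function. This is a sketch of the formula on this slide; the function name srs_sample_size and its argument names are invented here for illustration.

```r
# Sample size for estimating a proportion under simple random sampling
# with bound B on the error of estimation (D = B^2 / 4).
srs_sample_size <- function(N, B, p = 0.5) {
  D <- B^2 / 4
  q <- 1 - p
  N * p * q / ((N - 1) * D + p * q)
}

n_raw <- srs_sample_size(N = 2000, B = 0.05)   # p = q = 0.5, no prior information
print(n_raw)            # unrounded value
print(ceiling(n_raw))   # round up to whole people
```

Rounding up rather than to the nearest integer keeps the bound B guaranteed, which is why the slide says "at least".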

31 N = 40,000,000, 95% confidence level, B = 0.01, n = ? Homework 2: Calculate the values of n for p from 0.1 to 0.5. Draw the n vs. p plot and explain. Calculate the values of n with p = 0.3 and B from 0.01 to 0.20. Draw the n vs. B plot and explain.

32 Type I and II errors. $H_0$: no effect (no difference). $H_a$: effect (the two means are different). Type 1 error = P(conclude effect | no effect). Type 2 error = P(conclude no effect | effect).

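33 Comparing two proportions. $H_0: P_1 = P_2$ ($P_1 - P_2 = 0$) vs. $H_a: P_1 > P_2$. Test statistic: $Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{2\bar{p}\bar{q}/n}}$, where $\bar{p} = (\hat{p}_1 + \hat{p}_2)/2$ and $\bar{q} = 1 - \bar{p}$. Reject $H_0$ if $Z > Z_\alpha$. $P_1, P_2, Q_1, Q_2$: the true parameters, unknown fixed numbers of interest. $\hat{p}_1, \hat{p}_2, \hat{q}_1, \hat{q}_2$: estimates, the values used to estimate the parameters.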

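34 $\alpha = P(\text{Reject } H_0 \mid H_0 \text{ is true}) = P\left(\frac{\hat{p}_1 - \hat{p}_2}{\sqrt{2\bar{p}\bar{q}/n}} > Z_\alpha \,\middle|\, P_1 = P_2\right) = P(Z > Z_\alpha)$, $Z \sim N(0,1)$. Power $= 1 - \beta = P(\text{Reject } H_0 \mid H_a \text{ is true}) = P\left(\frac{\hat{p}_1 - \hat{p}_2}{\sqrt{2\bar{p}\bar{q}/n}} > Z_\alpha \,\middle|\, P_1 - P_2 = d\right)$. Under $H_a$: $E(\hat{p}_1 - \hat{p}_2 \mid H_a) = d$ and $\mathrm{Var}(\hat{p}_1 - \hat{p}_2 \mid H_a) = (P_1Q_1 + P_2Q_2)/n$.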

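35 Power $= 1 - \beta = P\left(\hat{p}_1 - \hat{p}_2 > Z_\alpha\sqrt{2\bar{p}\bar{q}/n} \,\middle|\, P_1 - P_2 = d\right)$
$= P\left(\frac{\hat{p}_1 - \hat{p}_2 - (P_1 - P_2)}{\sqrt{(P_1Q_1 + P_2Q_2)/n}} > \frac{Z_\alpha\sqrt{2\bar{p}\bar{q}/n} - (P_1 - P_2)}{\sqrt{(P_1Q_1 + P_2Q_2)/n}}\right) = P(Z > Z^*)$,
where $Z^* = \frac{Z_\alpha\sqrt{2\bar{p}\bar{q}/n} - (P_1 - P_2)}{\sqrt{(P_1Q_1 + P_2Q_2)/n}}$. Since the power is $1 - \beta$, $Z^* = Z_{1-\beta} = -Z_\beta$, and solving for n gives $n = \frac{\left[Z_\alpha\sqrt{2\bar{p}\bar{q}} + Z_\beta\sqrt{P_1Q_1 + P_2Q_2}\right]^2}{d^2}$, with $d = P_1 - P_2$.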
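The last step of this slide, going from the power statement to the formula for n, is plain algebra. It is written out below with the slide's symbols, reading $Z_\alpha$ and $Z_\beta$ as upper-tail normal quantiles (an interpretation, since the slide does not state the convention) and taking $\bar{p} = (P_1 + P_2)/2$, $\bar{q} = 1 - \bar{p}$ at the planning stage.

```latex
\begin{align*}
-Z_\beta
  &= \frac{Z_\alpha\sqrt{2\bar{p}\bar{q}/n} - d}{\sqrt{(P_1Q_1 + P_2Q_2)/n}}
  && \text{(power } = P(Z > Z^*) = 1 - \beta\text{)} \\
-Z_\beta\sqrt{P_1Q_1 + P_2Q_2}
  &= Z_\alpha\sqrt{2\bar{p}\bar{q}} - \sqrt{n}\,d
  && \text{(multiply both sides by } \sqrt{n}\sqrt{(P_1Q_1 + P_2Q_2)/n}\text{)} \\
\sqrt{n}\,d
  &= Z_\alpha\sqrt{2\bar{p}\bar{q}} + Z_\beta\sqrt{P_1Q_1 + P_2Q_2} \\
n &= \frac{\left[Z_\alpha\sqrt{2\bar{p}\bar{q}} + Z_\beta\sqrt{P_1Q_1 + P_2Q_2}\right]^2}{d^2}
\end{align*}
```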

36 Example: the incidence of side effects with the old drug is 30%. The researchers want to show that the incidence with the new drug is less than 25% and that the difference is significant. Sample size = ? Assume 5% loss to follow-up during 6 months, compliance = 90%, type 1 error = 5%, power = 80%. $Z_\alpha = 1.645$, $Z_\beta = 0.842$, $\bar{P} = (0.30 + 0.25)/2 = 0.275$.

37 Sample size for comparing two proportions (sample size for each group). Upper number: α = 0.05 (one-tailed) or α = 0.10 (two-tailed); β = 0.20. Middle number: α = 0.025 (one-tailed) or α = 0.05 (two-tailed); β = 0.20. Lower number: α = 0.005 (one-tailed) or α = 0.01 (two-tailed); β = 0.20. Rows: smaller of P1 and P2. Columns: expected difference between P1 and P2. [Table body not reproduced.]

39 If we take loss to follow-up and compliance into account, the sample size for each group is $n / \left((1 - 0.05) \times 0.90\right)$.
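The side-effect example (30% vs. 25%, one-tailed α = 0.05, power = 80%) can be worked through in R. The sketch below assumes the sample size formula for two proportions derived earlier in the deck; the function name n_two_props and the variable names are made up here, and power.prop.test from base R's stats package is used only as a cross-check.

```r
# Per-group sample size for comparing two proportions (one-tailed test),
# then inflation for 5% loss to follow-up and 90% compliance.
n_two_props <- function(p1, p2, alpha = 0.05, power = 0.80) {
  z_a   <- qnorm(1 - alpha)    # upper-tail quantile Z_alpha (one-tailed)
  z_b   <- qnorm(power)        # Z_beta, with beta = 1 - power
  p_bar <- (p1 + p2) / 2
  d     <- abs(p1 - p2)
  (z_a * sqrt(2 * p_bar * (1 - p_bar)) +
     z_b * sqrt(p1 * (1 - p1) + p2 * (1 - p2)))^2 / d^2
}

n_formula <- n_two_props(0.30, 0.25)
n_r <- power.prop.test(p1 = 0.30, p2 = 0.25, power = 0.80,
                       alternative = "one.sided")$n

# Inflate the rounded-up n for loss to follow-up and compliance
n_adjusted <- ceiling(n_formula) / ((1 - 0.05) * 0.90)

print(c(formula = n_formula, power.prop.test = n_r,
        adjusted = ceiling(n_adjusted)))
```

Whether to round n up before or after the adjustment is a design choice; rounding first, as done here, is the slightly more conservative option.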

40 > power.prop.test(power=.80, p1=.30, p2=.25, alternative=c("one.sided"))
Two-sample comparison of proportions power calculation
n =
p1 = 0.3
p2 = 0.25
sig.level = 0.05
power = 0.8
alternative = one.sided
NOTE: n is number in *each* group

41 proc power; twosamplefreq test=pchi proportiondiff = 0.05 refproportion = 0.25 power=0.80 sides=1 ntotal=. ; run;

42 The POWER Procedure: Pearson Chi-square Test for Two Proportions
Fixed Scenario Elements:
Distribution: Asymptotic normal
Method: Normal approximation
Number of Sides: 1
Reference (Group 1) Proportion: 0.25
Proportion Difference: 0.05
Nominal Power: 0.8
Null Proportion Difference: 0
Alpha: 0.05
Group 1 Weight: 1
Group 2 Weight: 1
Computed N Total: [columns Actual Power and N Total]

43 Homework 3 Suppose the expected difference between P1 and P2 is 0.25, with α = 0.05 (two-tailed) and β = 0.20 as in the table. Calculate the sample sizes for a range of values of the smaller of P1 and P2 using SAS or R. Compare the results with the table. Plot the sample size against the smaller value and explain.

44 Comparison of continuous variables. A clinical trial for a new arthritis drug. Blood prostaglandin level is the major outcome variable. We expect that the difference between the placebo and active groups is 2 ug/dl and the standard deviation is 2 ug/dl. Find the sample sizes for a one-tailed test and a two-tailed test. Power = 90%.

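45 $n_c = n_t = \frac{2(Z_\alpha + Z_\beta)^2\sigma^2}{(\mu_c - \mu_t)^2}$. Here $\Delta = \mu_c - \mu_t = 2.0$ ug/dl (10 × 0.2) and $\sigma = 2.0$ ug/dl. With $Z_\alpha = 1.645$ and $Z_\beta = 1.282$, $n_t = n_c = \frac{2(1.645 + 1.282)^2 (2.0)^2}{(2.0)^2} \approx 17.13$. Effect size (E/S) $= (\mu_1 - \mu_2)/\sigma$.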

46 Sample size for comparing the means of two groups (sample size for each group). Columns: one-tailed α, two-tailed α, and β. Rows: E/S. [Table body not reproduced.]

47 E/S = 2/2 = 1, n = 17. For a two-sided test, $Z_\alpha = 1.96$ and $n_t = n_c = \frac{2(1.96 + 1.282)^2 (2.0)^2}{(2.0)^2} \approx 21.02$. From the table, n = 21.
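Both hand calculations for the arthritis example can be checked in R. This is a sketch based on the normal-approximation formula $n = 2(Z_\alpha + Z_\beta)^2\sigma^2/\Delta^2$ with α = 0.05 and power = 0.90; the function name n_two_means and its sides argument are invented here. The call to power.t.test (base R, stats package) is included because it uses the t distribution and therefore returns a somewhat larger n.

```r
# Per-group sample size for comparing two means, normal approximation.
n_two_means <- function(delta, sd, alpha = 0.05, power = 0.90, sides = 1) {
  z_a <- qnorm(1 - alpha / sides)   # 1.645 for one-tailed, 1.96 for two-tailed
  z_b <- qnorm(power)               # 1.282 for power 0.90
  2 * (z_a + z_b)^2 * sd^2 / delta^2
}

n_one <- n_two_means(delta = 2, sd = 2, sides = 1)   # one-tailed
n_two <- n_two_means(delta = 2, sd = 2, sides = 2)   # two-tailed

# t-based answer for the two-tailed case, for comparison
n_t <- power.t.test(delta = 2, sd = 2, sig.level = 0.05, power = 0.9)$n

print(c(one_tailed = n_one, two_tailed = n_two, t_based_two_tailed = n_t))
```

The gap between the normal-approximation value and the t-based value is the price of estimating the standard deviation from the data, and it shrinks as n grows.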

48 > power.t.test(delta=2, sd=2, sig.level=0.05, power=0.9)
Two-sample t test power calculation
n =
delta = 2
sd = 2
sig.level = 0.05
power = 0.9
alternative = two.sided
NOTE: n is number in *each* group
> power.t.test(delta=2, sd=2, sig.level=0.05, power=0.9, alternative=c("one.sided"))
Two-sample t test power calculation
n =
delta = 2
sd = 2
sig.level = 0.05
power = 0.9
alternative = one.sided
NOTE: n is number in *each* group

49 proc power; twosamplemeans test=diff meandiff = 2 stddev = 2 power=0.90 sides=1 npergroup=. ; run;

50 The POWER Procedure: Two-sample t Test for Mean Difference
Fixed Scenario Elements:
Distribution: Normal
Method: Exact
Number of Sides: 1
Mean Difference: 2
Standard Deviation: 2
Nominal Power: 0.9
Null Difference: 0
Alpha: 0.05
Computed N Per Group: [columns Actual Power and N Per Group]

51 proc power; twosamplemeans test=diff meandiff = 2 stddev = 2 power=0.90 sides=2 npergroup=. ; run;

52 The POWER Procedure: Two-sample t Test for Mean Difference
Fixed Scenario Elements:
Distribution: Normal
Method: Exact
Number of Sides: 2
Mean Difference: 2
Standard Deviation: 2
Nominal Power: 0.9
Null Difference: 0
Alpha: 0.05
Computed N Per Group: [columns Actual Power and N Per Group]

53 One-sample and paired tests
> power.t.test(n=10, delta=10, sd=10*sqrt(2), type="paired")
Paired t test power calculation
n = 10
delta = 10
sd = 14.14214
sig.level = 0.05
power =
alternative = two.sided
NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs

54 > n = 10:50
> p = power.t.test(n, delta=10, sd=10*sqrt(2), type="paired")$power
> plot(n, p)

55 proc power; onesamplemeans means = 10 stddev = 14.14214 /* 10*sqrt(2) */ power=. ntotal=10 ; run;

56 The POWER Procedure: One-sample t Test for Mean
Fixed Scenario Elements:
Distribution: Normal
Method: Exact
Mean: 10
Standard Deviation:
Total Sample Size: 10
Number of Sides: 2
Null Mean: 0
Alpha: 0.05
Computed Power: Power = 0.514

57 proc power; onesamplemeans means = 10 stddev = 14.14214 /* 10*sqrt(2) */ power=. ntotal=10 ; plot x=n min=10 max=50 ; run; [Plot: power against total sample size.]

58 proc power; onesamplemeans means = stddev = /* 10*sqrt(2) 15*sqrt(2) */ power=. ntotal=10 ; plot x=n min=10 max=50 ; run; [Plot: power against total sample size, with one curve per combination of mean and standard deviation.]

59 References
1) Piantadosi, S. (1997), Clinical Trials: A Methodologic Perspective, John Wiley & Sons, New York.
2) Machin, D. et al. (1987), Sample Size Tables for Clinical Studies, Blackwell Science, London.
3) Chow, Shein-Chung and Liu, Jen-Pei (1998), Design and Analysis of Clinical Trials: Concepts and Methodologies, Wiley.
4) Shuster, J.J. (1993), Practical Handbook of Sample Size Guidelines for Clinical Trials, CRC Press, Florida.
5) Kim, Ho (2002), Calculation of an appropriate number of study subjects (in Korean), Korean Journal of Anesthesiology 42(1), 1-10.
