Sample size and power calculation using R and SAS proc power. Ho Kim GSPH, SNU

Size: px
Start display at page:

Download "Sample size and power calculation using R and SAS proc power. Ho Kim GSPH, SNU"

Transcription

1 Sample size and power calculation using R and SAS proc power Ho Kim GSPH, SNU

2 Pvalue (1) We want to show that the means of two populations are different! Y 1 a sample mean from the 1st pop Y 2 a sample mean from the 2nd pop If the two means are the same, two sample mean are very similar. If we calculate the differences of the two sample means repeatedly,

3 Pvalue (2) 0 Events with very low probability

4 Pvalue (3) Pvalue = P(observing our data or more extreme events two pop means are the same) Small pvalue : We observe an even that happens with very low probability. Something wrong with the null hypothesis. Reject the null hypothesis and we conclude that the two means are different.

5 A(observed data) > B (research hypothesis) B > A Null hypothesis (B) : No difference (Ho) Alternative hypo(b) : different means (Ha) Type I error = Pr (reject Ho Ho is true) Type II error = Pr (Not reject Ho Ha is true) Power = 1 (Prob. of finding the difference when the diff is real)

6 Example Y ~ Bin(100, p) Ho: p=1/2 vs. Ha: p 1/ 2 Fact: If Ho is true Y p p(1 p) n type I error ~ N(0,1) Y p Rejection p(1 p) Region n Pr(Reject Ho Ho is true) 0.5 = Pr Y p 1/ Pr( Z 1.645) 0.05 < 1.645

7 Example If we observe Y 45/ We cannot reject Ho since test statistics is not located inside of the RR Pvalue = Prob observing the difference of our data or bigger if Ho is true Pr( Z 1)

8 Example Pr( Z 1) > #R code > pnorm(1.645) [1] > pnorm(1) [1] *SAS Code ; data one; one=probnorm(1.645); two=probnorm(1); run; proc print;run;

9 Example To calculate power Y p p(1 p) n ~ N(0,1) p is not 1/2 Power Pr(Reject Ho Ha is true) 0.5 = Pr Y p? To calculate power, we need to know the exact value of p under Ha.

10 Example Accurate Problem: calculate the power when p=1/3 (Ha) Power 1 Y 3 ~ N 1 1 (0,1) (1 ) 3 3 n Pr(Reject Ho Ha is true) 0.5 = Pr Y p 1/

11 Example Power Pr(Reject Ho Ha is true) 0.5 = Pr Y p 1/ = Pr Y p 1/ 3 = Pr Y p 1/ 3 1/ / 3 = Pr Y p 1/ 3 1/ 3 2 / 3 1/ 3 2 / Pr( Z 1.796)

12 Example In general, > p0= (0:50)/ p 0 Power Pr Z p0(1 p0) 100 > power=pnorm((0.418 p0)/sqrt(p0*(1p0)/100)) > plot(p0,power) data one; do p0 = 0 to 0.5 by 0.01; power=probnorm((0.418 p0)/sqrt(p0*(1p0)/100)); output; end; run; proc gplot; plot power*p0; run;

13 Interpretation Ho: p=1/2 vs. Ha: p1/ 2 If our rule is Y p p(1 p) n < If p=0.2 then prob of rejecting Ho (power) is almost 1. If p=0.4, the power is 0.5. (If p>0.4, then our test cannot find the difference very wellhalf and half)

14 Likelihood and maximum likelihood estimator Likelihood =Pr(getting our data Given model) e.g. We observe 30 patients out of 100 people. Model : Y=# of pt out of 100 people Binomial (n=100, p)

15 Likelihood and maximum likelihood estimator pdf =Pr(Y=y Given model)= 100 p y Maximum likelihood estimator =p which maximizes likelihood (1 p) We can show that MLE= 30/100=0.3 How? y 100 y

16 Likelihood and maximum likelihood estimator l log( likelihood) y log p (100 y) log(1 p) l 0 p pˆ y/100 or > curve(30*log(x)+(10030)*log(1x), 0.1, 0.9)

17

18 > f=function(x) 30*log(x)+(10030)*log(1x) > optimize(f,c(0.1,0.9),maximum=true) $maximum [1] $objective [1]

19 Homework 1 Let s assume that # of people with certain genotype follows binomial distribution with parameter p. We observe 900 people with the genotype out of 10,000. Derive the maximum likelihood and MLE of p.

20 Multiple Comparisons Suppose we have size of H 01 and H 02 is Let H : 0, Pr(do not reject H H is true) 1 H Pr(do not reject H 01 and do no H02 H0 If we want to test overall is , not So type I error has inflated : 0, Pr(do not reject H H is true) then Pr(do not reject H H ) where H H and H (1 ) (1 ) (1 ) k t reject ) k (1 ) (1 ) (.95).95

21 Multiple Comparisons Bonferroni Correction : If we have m comparisons, then set the individual significance =, then overall type I error is close m to ex)m= (1 ) Ex) If we have 10 comparisons, the standard of individual p= This is called Bonferroni corrected pvalue.

22 Terminology (Piantadosi) Phase I trial: a clinical trial designed to measure the distribution, metabolism, exertion, and toxicity of a new drug. DF (dosefinding) Phase II trial: a clinical trial design to test the feasibility of, and level of activity of, a new agent or procedure SE (safety and efficacy)

23 Terminology Phase III trial: a clinical trial designed to estimate the relative efficacy of a treatment against a standard, alternative, or placebo. CTE (comparative treatment efficacy) Phase IV trial: a surveillance clinical trial design to estimate the frequency of uncommon side effects, toxicity, or interactions. ES (expanded safety)

24 Terminology 분위수 : quantile Probability density function 2 N(, ) Z Z

25 Type 1 error (α) Z α Onetail Twotail Type 2 error (β) power Z β

26 Get Motivated 2 Trt A Trt B + n n n 1+ n n n 2+ n +1 n +2 n ++ 2 n n n / n n n n n, v v11 n n / , p /

27 n 100 ij n ij p 100 / , 0.01 Difference of the two proportions are exactly the same. But statistical significances are very different??? Pvalue of the statistical test are heavily dependent on the sample size. Pvalues are very small for studies with very large sample size. Balance between the sample size(cost) and statistical significance(efficacy of the experiment) We want to show something with minimum cost.

28 Sample size calculations Sampling survey purpose : estimation tool : sampling error example : opinion survey Clinical trials purpose : testing tool : Type I and II error example : clinical trial

29 simple random sampling N : pop size, n :sample size ˆ y y / n Var y n i1 i 2 N n n N 1 2 N n 1.96 Var( y) 2 B : n N 1 2 N n D B 2 ( N 1) D 2, / 4 95% 신뢰구간 CI (SE)( 표준오차 )

30 If = 0 or 1, y y i is proportion, Npq n ( N 1) D pq Ex1) N=2000, 95% confidence level B=0.05, n=? >> If there is no prior information, p=q=0.5 D 2 2 B / / n At least 334 people

31 N=40,000,000 95% Confidence level B=0.01, n=? Homework 2 Calculate the values of n with p=0.1~0.5 by Draw n vs. p plot and explain. Calculate the values of n with P=0.3, and B=0.01~0.20. Draw n vs B plot and explain.

32 Type I and II errors H o : No effect: no difference H a : Effect: Two means are different. Type 1 error = P(Effect. No effect) Type 2 error = P(No effect Effect)

33 Comparing two proportions H : P P ( P P ) vs. H : P P a 1 2 검정통계량 Test stat Reject 0 pˆ ˆp 2 2 pq / n p pˆ q p 1 Z=, ( 1 ˆp 2 ) / 2, 1 H if Z Z a P1, P2, Q1, Q2 : 실제 Parameters 관심이있는미지의고정된수 ( 모수 ) pˆ 1, ˆp 2, ˆq 1, ˆq 2 : 모수를 Estimates 추정하기위한추정치

34 P(Reject H H is true) 0 0 pˆ ˆp 2 2 pq / n 1 PZ Za P1 P2 P Z Z Z ~ N(0,1) a power P(Reject H H is true) pˆ ˆp 2 2 pq / n E( pˆ 1 ˆp 2 H a ) d Var( pˆ H ) ( PQ P Q ) / n 0 1 PZ Za P1 P2 d ˆp 2 1 a a

35 pˆ ˆp 2 2 pq / n P( pˆ ˆp 2 ( P P ) Z 2 pq / n ( P P ) d 0) 1 * P Za d pˆ ˆp 2 ( P P ) Z 2 pq / n ( P P ) ( PQ P Q ) / n ( PQ P Q ) / n P d Z 2 pq / n ( P P ) ( PQ 1 1 P2 Q2 ) / n Power p Z d 0 0 Za 2 pq / n ( P1P 2) * ( PQ P Q )/ n n Z 2 pq Z ( PQ P Q ) d 2 Z Z Z 1 Z Z 1 Z

36 The incidence of side effects: Old drug=30%. The researchers want to prove that incidence of new drug is less than 25% and the difference is significant. Sample size=?, 5% loss of followups during 6 months, compliance=90%, type 1 error=5%, power=80% Z 1.645, Z P ( ) /

37 Sample size for comparing two proportions (sample size for each group) upper number : α = 0.05 (onetailed) or α = 0.10 (twotailed) ; β = 0.20 middle number : α = (onetailed) or α = 0.05 (twotailed) ; β = 0.20 lower number : α = (onetailed) or α = 0.01 (twotailed) ; β = 0.20 Smaller of P1 and P2 a Expected difference between P1 and P

38

39 If we take loss of followups and compliance, sample size for each group is n (1 0.05) 0.90

40 > power.prop.test(power=.80,p1=.30,p2=.25, alternative=c("one.sided")) Twosample comparison of proportions power calculation n = p1 = 0.3 p2 = 0.25 sig.level = 0.05 power = 0.8 alternative = one.sided NOTE: n is number in *each* group

41 proc power; twosamplefreq test=pchi proportiondiff = 0.05 refproportion = 0.25 power=0.80 sides=1 ntotal=. ; run;

42 The POWER Procedure Pearson Chisquare Test for Two Proportions Fixed Scenario Elements Distribution Asymptotic normal Method Normal approximation Number of Sides 1 Reference (Group 1) Proportion 0.25 Proportion Difference 0.05 Nominal Power 0.8 Null Proportion Difference 0 Alpha 0.05 Group 1 Weight 1 Group 2 Weight 1 Computed N Total Actual Power N Total

43 Homework 3 Suppose Expected difference between P1 and P2=0.25, α = 0.05 (twotailed) ; β = Calculate sample sizes for smaller of P1 and P2 using SAS or R. Compare the results with the table. Draw plots of sample size and the smaller values. Explain

44 연속형변수의비교 A clinical trial for new drug of arthritis. Blood Prostaglanding level is the major outcome variable. We expect that the difference between placebo and active groups is 2 ug/dl and the standard deviation is 2 ug/dl. Sample sizes for onetailed test and twotailed test. Power=90%,

45 n c 2( Z ) 2 2 Z ( ) c t 2 A gm / dl(10 0.2), 2.0 gm / dl Z n t 1.645, Z nc ( ) / Effective Size (=E/S)= 1 2

46 Sample size for comparing the means of two groups (sample size for each group) Onetailed α = Twotailed α = β= E/S a

47 E/S=2/2=1, n=17 For two sided test, Z n t ( ) 2.0 nc Table> n=21

48 > power.t.test(delta=2,sd=2,sig.level=0.05,power=0.9) Twosample t test power calculation n = delta = 2 sd = 2 sig.level = 0.05 power = 0.9 alternative = two.sided NOTE: n is number in *each* group > power.t.test(delta=2,sd=2,sig.level=0.05,power=0.9,altern ative=c("one.sided")) Twosample t test power calculation n = delta = 2 sd = 2 sig.level = 0.05 power = 0.9 alternative = one.sided NOTE: n is number in *each* group

49 proc power; twosamplemeans test=diff meandiff = 2 stddev = 2 power=0.90 sides=1 npergroup=. ; run;

50 The POWER Procedure Twosample t Test for Mean Difference Fixed Scenario Elements Distribution Normal Method Exact Number of Sides 1 Mean Difference 2 Standard Deviation 2 Nominal Power 0.9 Null Difference 0 Alpha 0.05 Computed N Per Group Actual N Per Power Group

51 proc power; twosamplemeans test=diff meandiff = 2 stddev = 2 power=0.90 sides=2 npergroup=. ; run;

52 The POWER Procedure Twosample t Test for Mean Difference Fixed Scenario Elements Distribution Normal Method Exact Number of Sides 2 Mean Difference 2 Standard Deviation 2 Nominal Power 0.9 Null Difference 0 Alpha 0.05 Computed N Per Group Actual Power N Per Group

53 Onesample and paired tests > power.t.test(n=10,delta=10, sd=10*sqrt(2),type="paired") Paired t test power calculation n = 10 delta = 10 sd = sig.level = 0.05 power = alternative = two.sided NOTE: n is number of *pairs*, sd is std.dev. of *differences* within pairs

54 > n=10:50 > p=power.t.test(n,delta=10, sd=10*sqrt(2),type="paired")$power > plot(n,p)

55 proc power; onesamplemeans means = 10 stddev = power=. ntotal=10 ; run;

56 The POWER Procedure Onesample t Test for Mean Fixed Scenario Elements Distribution Normal Method Exact Mean 10 Standard Deviation Total Sample Size 10 Number of Sides 2 Null Mean 0 Alpha 0.05 Computed Power Power 0.514

57 proc power; onesamplemeans means = 10 stddev = power=. ntotal=10 ; plot x=n min=10 max=50 ; run; Total Sample Size

58 proc power; onesamplemeans means = stddev = /* 10*sqrt(2) 15*sqrt(2) */ power=. ntotal=10 ; plot x=n min=10 max=50 ; run; Total Sample Size Mean Std Dev

59 References 1)Piantadosi, S. (1997), Clinical Trials, A methodological perspectives, John Wiley & Sons, New York. 2)Machin, D. et al. (1987), Sample size tables for clinical studies, Blackwell Sciences, London. 3)ScheinChung Chow and Jenpei Liu (1998), Design and analysis of clinical trials: Concepts and methodologies, Wiley 4)Schuster, J.J. (1993), Practical handbook of sample size guidelines for clinical trials, CRC Press, Florida. 5) 김호 (2002), 적절한연구대상수의산출, 대한마취과학회지 : 42(1), 110.

Sample Size/Power Calculation by Software/Online Calculators

Sample Size/Power Calculation by Software/Online Calculators Sample Size/Power Calculation by Software/Online Calculators May 24, 2018 Li Zhang, Ph.D. li.zhang@ucsf.edu Associate Professor Department of Epidemiology and Biostatistics Division of Hematology and Oncology

More information

Power and Sample Size Bios 662

Power and Sample Size Bios 662 Power and Sample Size Bios 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2008-10-31 14:06 BIOS 662 1 Power and Sample Size Outline Introduction One sample: continuous

More information

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials

Binomial Model. Lecture 10: Introduction to Logistic Regression. Logistic Regression. Binomial Distribution. n independent trials Lecture : Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 27 Binomial Model n independent trials (e.g., coin tosses) p = probability of success on each trial (e.g., p =! =

More information

Lecture 10: Introduction to Logistic Regression

Lecture 10: Introduction to Logistic Regression Lecture 10: Introduction to Logistic Regression Ani Manichaikul amanicha@jhsph.edu 2 May 2007 Logistic Regression Regression for a response variable that follows a binomial distribution Recall the binomial

More information

E509A: Principle of Biostatistics. GY Zou

E509A: Principle of Biostatistics. GY Zou E509A: Principle of Biostatistics (Week 4: Inference for a single mean ) GY Zou gzou@srobarts.ca Example 5.4. (p. 183). A random sample of n =16, Mean I.Q is 106 with standard deviation S =12.4. What

More information

Unit 9: Inferences for Proportions and Count Data

Unit 9: Inferences for Proportions and Count Data Unit 9: Inferences for Proportions and Count Data Statistics 571: Statistical Methods Ramón V. León 12/15/2008 Unit 9 - Stat 571 - Ramón V. León 1 Large Sample Confidence Interval for Proportion ( pˆ p)

More information

POWER FOR COMPARING TWO PROPORTIONS WITH INDEPENDENT SAMPLES

POWER FOR COMPARING TWO PROPORTIONS WITH INDEPENDENT SAMPLES This handout covers material found in Section 0.5 of the text. POWER FOR COMPARING TWO PROPORTIONS WITH INDEPENDENT SAMPLES EXAMPLE: Otolaryngology (Example 0.3 of your text, page 405). Suppose a study

More information

Power and the computation of sample size

Power and the computation of sample size 9 Power and the computation of sample size A statistical test will not be able to detect a true difference if the sample size is too small compared with the magnitude of the difference. When designing

More information

Comparing Means from Two-Sample

Comparing Means from Two-Sample Comparing Means from Two-Sample Kwonsang Lee University of Pennsylvania kwonlee@wharton.upenn.edu April 3, 2015 Kwonsang Lee STAT111 April 3, 2015 1 / 22 Inference from One-Sample We have two options to

More information

Using SAS Proc Power to Perform Model-based Power Analysis for Clinical Pharmacology Studies Peng Sun, Merck & Co., Inc.

Using SAS Proc Power to Perform Model-based Power Analysis for Clinical Pharmacology Studies Peng Sun, Merck & Co., Inc. PharmaSUG00 - Paper SP05 Using SAS Proc Power to Perform Model-based Power Analysis for Clinical Pharmacology Studies Peng Sun, Merck & Co., Inc., North Wales, PA ABSRAC In this paper, we demonstrate that

More information

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS

Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS Stat 135, Fall 2006 A. Adhikari HOMEWORK 6 SOLUTIONS 1a. Under the null hypothesis X has the binomial (100,.5) distribution with E(X) = 50 and SE(X) = 5. So P ( X 50 > 10) is (approximately) two tails

More information

General Linear Model (Chapter 4)

General Linear Model (Chapter 4) General Linear Model (Chapter 4) Outcome variable is considered continuous Simple linear regression Scatterplots OLS is BLUE under basic assumptions MSE estimates residual variance testing regression coefficients

More information

Unit 9: Inferences for Proportions and Count Data

Unit 9: Inferences for Proportions and Count Data Unit 9: Inferences for Proportions and Count Data Statistics 571: Statistical Methods Ramón V. León 1/15/008 Unit 9 - Stat 571 - Ramón V. León 1 Large Sample Confidence Interval for Proportion ( pˆ p)

More information

Paper Equivalence Tests. Fei Wang and John Amrhein, McDougall Scientific Ltd.

Paper Equivalence Tests. Fei Wang and John Amrhein, McDougall Scientific Ltd. Paper 11683-2016 Equivalence Tests Fei Wang and John Amrhein, McDougall Scientific Ltd. ABSTRACT Motivated by the frequent need for equivalence tests in clinical trials, this paper provides insights into

More information

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples

Section 9.4. Notation. Requirements. Definition. Inferences About Two Means (Matched Pairs) Examples Objective Section 9.4 Inferences About Two Means (Matched Pairs) Compare of two matched-paired means using two samples from each population. Hypothesis Tests and Confidence Intervals of two dependent means

More information

Optimal rejection regions for testing multiple binary endpoints in small samples

Optimal rejection regions for testing multiple binary endpoints in small samples Optimal rejection regions for testing multiple binary endpoints in small samples Robin Ristl and Martin Posch Section for Medical Statistics, Center of Medical Statistics, Informatics and Intelligent Systems,

More information

Inference for Binomial Parameters

Inference for Binomial Parameters Inference for Binomial Parameters Dipankar Bandyopadhyay, Ph.D. Department of Biostatistics, Virginia Commonwealth University D. Bandyopadhyay (VCU) BIOS 625: Categorical Data & GLM 1 / 58 Inference for

More information

Practice Questions: Statistics W1111, Fall Solutions

Practice Questions: Statistics W1111, Fall Solutions Practice Questions: Statistics W, Fall 9 Solutions Question.. The standard deviation of Z is 89... P(=6) =..3. is definitely inside of a 95% confidence interval for..4. (a) YES (b) YES (c) NO (d) NO Questions

More information

Sample Size Determination

Sample Size Determination Sample Size Determination 018 The number of subjects in a clinical study should always be large enough to provide a reliable answer to the question(s addressed. The sample size is usually determined by

More information

Outline. PubH 5450 Biostatistics I Prof. Carlin. Confidence Interval for the Mean. Part I. Reviews

Outline. PubH 5450 Biostatistics I Prof. Carlin. Confidence Interval for the Mean. Part I. Reviews Outline Outline PubH 5450 Biostatistics I Prof. Carlin Lecture 11 Confidence Interval for the Mean Known σ (population standard deviation): Part I Reviews σ x ± z 1 α/2 n Small n, normal population. Large

More information

LECTURE 12 CONFIDENCE INTERVAL AND HYPOTHESIS TESTING

LECTURE 12 CONFIDENCE INTERVAL AND HYPOTHESIS TESTING LECTURE 1 CONFIDENCE INTERVAL AND HYPOTHESIS TESTING INTERVAL ESTIMATION Point estimation of : The inference is a guess of a single value as the value of. No accuracy associated with it. Interval estimation

More information

Medical statistics part I, autumn 2010: One sample test of hypothesis

Medical statistics part I, autumn 2010: One sample test of hypothesis Medical statistics part I, autumn 2010: One sample test of hypothesis Eirik Skogvoll Consultant/ Professor Faculty of Medicine Dept. of Anaesthesiology and Emergency Medicine 1 What is a hypothesis test?

More information

On Assumptions. On Assumptions

On Assumptions. On Assumptions On Assumptions An overview Normality Independence Detection Stem-and-leaf plot Study design Normal scores plot Correction Transformation More complex models Nonparametric procedure e.g. time series Robustness

More information

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing

STAT 135 Lab 5 Bootstrapping and Hypothesis Testing STAT 135 Lab 5 Bootstrapping and Hypothesis Testing Rebecca Barter March 2, 2015 The Bootstrap Bootstrap Suppose that we are interested in estimating a parameter θ from some population with members x 1,...,

More information

BIOS 312: Precision of Statistical Inference

BIOS 312: Precision of Statistical Inference and Power/Sample Size and Standard Errors BIOS 312: of Statistical Inference Chris Slaughter Department of Biostatistics, Vanderbilt University School of Medicine January 3, 2013 Outline Overview and Power/Sample

More information

Introduction to Crossover Trials

Introduction to Crossover Trials Introduction to Crossover Trials Stat 6500 Tutorial Project Isaac Blackhurst A crossover trial is a type of randomized control trial. It has advantages over other designed experiments because, under certain

More information

HYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă

HYPOTHESIS TESTING II TESTS ON MEANS. Sorana D. Bolboacă HYPOTHESIS TESTING II TESTS ON MEANS Sorana D. Bolboacă OBJECTIVES Significance value vs p value Parametric vs non parametric tests Tests on means: 1 Dec 14 2 SIGNIFICANCE LEVEL VS. p VALUE Materials and

More information

E509A: Principle of Biostatistics. GY Zou

E509A: Principle of Biostatistics. GY Zou E509A: Principle of Biostatistics (Effect measures ) GY Zou gzou@robarts.ca We have discussed inference procedures for 2 2 tables in the context of comparing two groups. Yes No Group 1 a b n 1 Group 2

More information

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression

Acknowledgements. Outline. Marie Diener-West. ICTR Leadership / Team INTRODUCTION TO CLINICAL RESEARCH. Introduction to Linear Regression INTRODUCTION TO CLINICAL RESEARCH Introduction to Linear Regression Karen Bandeen-Roche, Ph.D. July 17, 2012 Acknowledgements Marie Diener-West Rick Thompson ICTR Leadership / Team JHU Intro to Clinical

More information

Lecture 9 Two-Sample Test. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech

Lecture 9 Two-Sample Test. Fall 2013 Prof. Yao Xie, H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech Lecture 9 Two-Sample Test Fall 2013 Prof. Yao Xie, yao.xie@isye.gatech.edu H. Milton Stewart School of Industrial Systems & Engineering Georgia Tech Computer exam 1 18 Histogram 14 Frequency 9 5 0 75 83.33333333

More information

An interval estimator of a parameter θ is of the form θl < θ < θu at a

An interval estimator of a parameter θ is of the form θl < θ < θu at a Chapter 7 of Devore CONFIDENCE INTERVAL ESTIMATORS An interval estimator of a parameter θ is of the form θl < θ < θu at a confidence pr (or a confidence coefficient) of 1 α. When θl =, < θ < θu is called

More information

Population Variance. Concepts from previous lectures. HUMBEHV 3HB3 one-sample t-tests. Week 8

Population Variance. Concepts from previous lectures. HUMBEHV 3HB3 one-sample t-tests. Week 8 Concepts from previous lectures HUMBEHV 3HB3 one-sample t-tests Week 8 Prof. Patrick Bennett sampling distributions - sampling error - standard error of the mean - degrees-of-freedom Null and alternative/research

More information

Chapter Six: Two Independent Samples Methods 1/51

Chapter Six: Two Independent Samples Methods 1/51 Chapter Six: Two Independent Samples Methods 1/51 6.3 Methods Related To Differences Between Proportions 2/51 Test For A Difference Between Proportions:Introduction Suppose a sampling distribution were

More information

z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests

z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests z and t tests for the mean of a normal distribution Confidence intervals for the mean Binomial tests Chapters 3.5.1 3.5.2, 3.3.2 Prof. Tesler Math 283 Fall 2018 Prof. Tesler z and t tests for mean Math

More information

The SEQDESIGN Procedure

The SEQDESIGN Procedure SAS/STAT 9.2 User s Guide, Second Edition The SEQDESIGN Procedure (Book Excerpt) This document is an individual chapter from the SAS/STAT 9.2 User s Guide, Second Edition. The correct bibliographic citation

More information

TUTORIAL 8 SOLUTIONS #

TUTORIAL 8 SOLUTIONS # TUTORIAL 8 SOLUTIONS #9.11.21 Suppose that a single observation X is taken from a uniform density on [0,θ], and consider testing H 0 : θ = 1 versus H 1 : θ =2. (a) Find a test that has significance level

More information

Epidemiology Principle of Biostatistics Chapter 11 - Inference about probability in a single population. John Koval

Epidemiology Principle of Biostatistics Chapter 11 - Inference about probability in a single population. John Koval Epidemiology 9509 Principle of Biostatistics Chapter 11 - Inference about probability in a single population John Koval Department of Epidemiology and Biostatistics University of Western Ontario What is

More information

Chapter 9. Inferences from Two Samples. Objective. Notation. Section 9.2. Definition. Notation. q = 1 p. Inferences About Two Proportions

Chapter 9. Inferences from Two Samples. Objective. Notation. Section 9.2. Definition. Notation. q = 1 p. Inferences About Two Proportions Chapter 9 Inferences from Two Samples 9. Inferences About Two Proportions 9.3 Inferences About Two s (Independent) 9.4 Inferences About Two s (Matched Pairs) 9.5 Comparing Variation in Two Samples Objective

More information

F79SM STATISTICAL METHODS

F79SM STATISTICAL METHODS F79SM STATISTICAL METHODS SUMMARY NOTES 9 Hypothesis testing 9.1 Introduction As before we have a random sample x of size n of a population r.v. X with pdf/pf f(x;θ). The distribution we assign to X is

More information

4.8 Alternate Analysis as a Oneway ANOVA

4.8 Alternate Analysis as a Oneway ANOVA 4.8 Alternate Analysis as a Oneway ANOVA Suppose we have data from a two-factor factorial design. The following method can be used to perform a multiple comparison test to compare treatment means as well

More information

+ Specify 1 tail / 2 tail

+ Specify 1 tail / 2 tail Week 2: Null hypothesis Aeroplane seat designer wonders how wide to make the plane seats. He assumes population average hip size μ = 43.2cm Sample size n = 50 Question : Is the assumption μ = 43.2cm reasonable?

More information

Lecture 10: Comparing two populations: proportions

Lecture 10: Comparing two populations: proportions Lecture 10: Comparing two populations: proportions Problem: Compare two sets of sample data: e.g. is the proportion of As in this semester 152 the same as last Fall? Methods: Extend the methods introduced

More information

Statistics in medicine

Statistics in medicine Statistics in medicine Lecture 3: Bivariate association : Categorical variables Proportion in one group One group is measured one time: z test Use the z distribution as an approximation to the binomial

More information

6 Single Sample Methods for a Location Parameter

6 Single Sample Methods for a Location Parameter 6 Single Sample Methods for a Location Parameter If there are serious departures from parametric test assumptions (e.g., normality or symmetry), nonparametric tests on a measure of central tendency (usually

More information

Analysis of 2x2 Cross-Over Designs using T-Tests

Analysis of 2x2 Cross-Over Designs using T-Tests Chapter 234 Analysis of 2x2 Cross-Over Designs using T-Tests Introduction This procedure analyzes data from a two-treatment, two-period (2x2) cross-over design. The response is assumed to be a continuous

More information

Topic 19 Extensions on the Likelihood Ratio

Topic 19 Extensions on the Likelihood Ratio Topic 19 Extensions on the Likelihood Ratio Two-Sided Tests 1 / 12 Outline Overview Normal Observations Power Analysis 2 / 12 Overview The likelihood ratio test is a popular choice for composite hypothesis

More information

CMU MSP 36726: Power

CMU MSP 36726: Power CMU MSP 36726: Power H. Seltman 2/21/2018 I. Consider three experiments: 1) Recruitment, randomization, priming with your group does well/poorly in math, math testing 2) Recruitment, each subject gets

More information

Hypothesis tests

Hypothesis tests 6.1 6.4 Hypothesis tests Prof. Tesler Math 186 February 26, 2014 Prof. Tesler 6.1 6.4 Hypothesis tests Math 186 / February 26, 2014 1 / 41 6.1 6.2 Intro to hypothesis tests and decision rules Hypothesis

More information

Population 1 Population 2

Population 1 Population 2 Two Population Case Testing the Difference Between Two Population Means Sample of Size n _ Sample mean = x Sample s.d.=s x Sample of Size m _ Sample mean = y Sample s.d.=s y Pop n mean=μ x Pop n s.d.=

More information

Sample Size and Power I: Binary Outcomes. James Ware, PhD Harvard School of Public Health Boston, MA

Sample Size and Power I: Binary Outcomes. James Ware, PhD Harvard School of Public Health Boston, MA Sample Size and Power I: Binary Outcomes James Ware, PhD Harvard School of Public Health Boston, MA Sample Size and Power Principles: Sample size calculations are an essential part of study design Consider

More information

Nonparametric tests. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 704: Data Analysis I

Nonparametric tests. Timothy Hanson. Department of Statistics, University of South Carolina. Stat 704: Data Analysis I 1 / 16 Nonparametric tests Timothy Hanson Department of Statistics, University of South Carolina Stat 704: Data Analysis I Nonparametric one and two-sample tests 2 / 16 If data do not come from a normal

More information

Exam 2 (KEY) July 20, 2009

Exam 2 (KEY) July 20, 2009 STAT 2300 Business Statistics/Summer 2009, Section 002 Exam 2 (KEY) July 20, 2009 Name: USU A#: Score: /225 Directions: This exam consists of six (6) questions, assessing material learned within Modules

More information

Power and sample size calculations

Power and sample size calculations Power and sample size calculations Susanne Rosthøj Biostatistisk Afdeling Institut for Folkesundhedsvidenskab Københavns Universitet sr@biostat.ku.dk April 8, 2014 Planning an investigation How many individuals

More information

ph: 5.2, 5.6, 5.8, 6.4, 6.5, 6.8, 6.9, 7.2, 7.5 sample mean = sample sd = sample size, n = 9

ph: 5.2, 5.6, 5.8, 6.4, 6.5, 6.8, 6.9, 7.2, 7.5 sample mean = sample sd = sample size, n = 9 Name: SOLUTIONS Final Part 1 (100 pts) and Final Part 2 (120 pts) For all of the questions below, please show enough work that it is completely clear how your final solution was derived. Sit at least one

More information

One-sample categorical data: approximate inference

One-sample categorical data: approximate inference One-sample categorical data: approximate inference Patrick Breheny October 6 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction It is relatively easy to think about the distribution

More information

Sample Size / Power Calculations

Sample Size / Power Calculations Sample Size / Power Calculations A Simple Example Goal: To study the effect of cold on blood pressure (mmhg) in rats Use a Completely Randomized Design (CRD): 12 rats are randomly assigned to one of two

More information

ST505/S697R: Fall Homework 2 Solution.

ST505/S697R: Fall Homework 2 Solution. ST505/S69R: Fall 2012. Homework 2 Solution. 1. 1a; problem 1.22 Below is the summary information (edited) from the regression (using R output); code at end of solution as is code and output for SAS. a)

More information

Comparing two independent samples

Comparing two independent samples In many applications it is necessary to compare two competing methods (for example, to compare treatment effects of a standard drug and an experimental drug). To compare two methods from statistical point

More information

Lab #12: Exam 3 Review Key

Lab #12: Exam 3 Review Key Psychological Statistics Practice Lab#1 Dr. M. Plonsky Page 1 of 7 Lab #1: Exam 3 Review Key 1) a. Probability - Refers to the likelihood that an event will occur. Ranges from 0 to 1. b. Sampling Distribution

More information

Interval estimation. October 3, Basic ideas CLT and CI CI for a population mean CI for a population proportion CI for a Normal mean

Interval estimation. October 3, Basic ideas CLT and CI CI for a population mean CI for a population proportion CI for a Normal mean Interval estimation October 3, 2018 STAT 151 Class 7 Slide 1 Pandemic data Treatment outcome, X, from n = 100 patients in a pandemic: 1 = recovered and 0 = not recovered 1 1 1 0 0 0 1 1 1 0 0 1 0 1 0 0

More information

Section Comparing Two Proportions

Section Comparing Two Proportions Section 8.2 - Comparing Two Proportions Statistics 104 Autumn 2004 Copyright c 2004 by Mark E. Irwin Comparing Two Proportions Two-sample problems Want to compare the responses in two groups or treatments

More information

Power of a test. Hypothesis testing

Power of a test. Hypothesis testing Hypothesis testing February 11, 2014 Debdeep Pati Power of a test 1. Assuming standard deviation is known. Calculate power based on one-sample z test. A new drug is proposed for people with high intraocular

More information

Power Analysis for One-Way ANOVA

Power Analysis for One-Way ANOVA Chapter 12 Power Analysis for One-Way ANOVA Recall that the power of a statistical test is the probability of rejecting H 0 when H 0 is false, and some alternative hypothesis H 1 is true. We saw earlier

More information

BIO5312 Biostatistics Lecture 6: Statistical hypothesis testings

BIO5312 Biostatistics Lecture 6: Statistical hypothesis testings BIO5312 Biostatistics Lecture 6: Statistical hypothesis testings Yujin Chung October 4th, 2016 Fall 2016 Yujin Chung Lec6: Statistical hypothesis testings Fall 2016 1/30 Previous Two types of statistical

More information

Introduction to Power and Sample Size Analysis

Introduction to Power and Sample Size Analysis Four chapters from SAS/STAT User's Guide 13.2: (18) Introduction to Power and Sample Size Analysis (47) The GLMPOWER Procedure (77) The POWER Procedure (78) The Power and Sample Size Application Chapter

More information

Two Sample Problems. Two sample problems

Two Sample Problems. Two sample problems Two Sample Problems Two sample problems The goal of inference is to compare the responses in two groups. Each group is a sample from a different population. The responses in each group are independent

More information

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants

Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009 there were participants 18.650 Statistics for Applications Chapter 5: Parametric hypothesis testing 1/37 Cherry Blossom run (1) The credit union Cherry Blossom Run is a 10 mile race that takes place every year in D.C. In 2009

More information

Unit 9 ONE Sample Inference SOLUTIONS

Unit 9 ONE Sample Inference SOLUTIONS BIOSTATS 540 Fall 017 Introductory Biostatistics Solutions Unit 9 ONE Sample Inference SOLUTIONS 1. This exercise gives you practice calculating a confidence interval for the mean of a Normal distribution

More information

CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity

CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity CHL 5225H Advanced Statistical Methods for Clinical Trials: Multiplicity Prof. Kevin E. Thorpe Dept. of Public Health Sciences University of Toronto Objectives 1. Be able to distinguish among the various

More information

1; (f) H 0 : = 55 db, H 1 : < 55.

1; (f) H 0 : = 55 db, H 1 : < 55. Reference: Chapter 8 of J. L. Devore s 8 th Edition By S. Maghsoodloo TESTING a STATISTICAL HYPOTHESIS A statistical hypothesis is an assumption about the frequency function(s) (i.e., pmf or pdf) of one

More information

PLS205 Lab 2 January 15, Laboratory Topic 3

PLS205 Lab 2 January 15, Laboratory Topic 3 PLS205 Lab 2 January 15, 2015 Laboratory Topic 3 General format of ANOVA in SAS Testing the assumption of homogeneity of variances by "/hovtest" by ANOVA of squared residuals Proc Power for ANOVA One-way

More information

Formulas and Tables. for Essentials of Statistics, by Mario F. Triola 2002 by Addison-Wesley. ˆp E p ˆp E Proportion.

Formulas and Tables. for Essentials of Statistics, by Mario F. Triola 2002 by Addison-Wesley. ˆp E p ˆp E Proportion. Formulas and Tables for Essentials of Statistics, by Mario F. Triola 2002 by Addison-Wesley. Ch. 2: Descriptive Statistics x Sf. x x Sf Mean S(x 2 x) 2 s Å n 2 1 n(sx 2 ) 2 (Sx) 2 s Å n(n 2 1) Mean (frequency

More information

Dose-response modeling with bivariate binary data under model uncertainty

Dose-response modeling with bivariate binary data under model uncertainty Dose-response modeling with bivariate binary data under model uncertainty Bernhard Klingenberg 1 1 Department of Mathematics and Statistics, Williams College, Williamstown, MA, 01267 and Institute of Statistics,

More information

Chapter 8. Inferences Based on a Two Samples Confidence Intervals and Tests of Hypothesis

Chapter 8. Inferences Based on a Two Samples Confidence Intervals and Tests of Hypothesis Chapter 8 Inferences Based on a Two Samples Confidence Intervals and Tests of Hypothesis Copyright 2018, 2014, and 2011 Pearson Education, Inc. Slide - 1 Content 1. Identifying the Target Parameter 2.

More information

Epidemiology Principle of Biostatistics Chapter 14 - Dependent Samples and effect measures. John Koval

Epidemiology Principle of Biostatistics Chapter 14 - Dependent Samples and effect measures. John Koval Epidemiology 9509 Principle of Biostatistics Chapter 14 - Dependent Samples and effect measures John Koval Department of Epidemiology and Biostatistics University of Western Ontario What is being covered

More information

STAT Chapter 8: Hypothesis Tests

STAT Chapter 8: Hypothesis Tests STAT 515 -- Chapter 8: Hypothesis Tests CIs are possibly the most useful forms of inference because they give a range of reasonable values for a parameter. But sometimes we want to know whether one particular

More information

STATISTICAL INFERENCE PART II CONFIDENCE INTERVALS AND HYPOTHESIS TESTING

STATISTICAL INFERENCE PART II CONFIDENCE INTERVALS AND HYPOTHESIS TESTING STATISTICAL INFERENCE PART II CONFIDENCE INTERVALS AND HYPOTHESIS TESTING 1 LOCATION PARAMETER Let f(x) be any pdf. The family of pdfs f(x) indexed by parameter, is called the location family with standard

More information

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary

The t-distribution. Patrick Breheny. October 13. z tests The χ 2 -distribution The t-distribution Summary Patrick Breheny October 13 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction Introduction What s wrong with z-tests? So far we ve (thoroughly!) discussed how to carry out hypothesis

More information

Central Limit Theorem ( 5.3)

Central Limit Theorem ( 5.3) Central Limit Theorem ( 5.3) Let X 1, X 2,... be a sequence of independent random variables, each having n mean µ and variance σ 2. Then the distribution of the partial sum S n = X i i=1 becomes approximately

More information

9-6. Testing the difference between proportions /20

9-6. Testing the difference between proportions /20 9-6 Testing the difference between proportions 1 Homework Discussion Question p514 Ex 9-6 p514 2, 3, 4, 7, 9, 11 (use both the critical value and p-value for all problems. 2 Objective Perform hypothesis

More information

SAS/STAT 15.1 User s Guide The SEQDESIGN Procedure

SAS/STAT 15.1 User s Guide The SEQDESIGN Procedure SAS/STAT 15.1 User s Guide The SEQDESIGN Procedure This document is an individual chapter from SAS/STAT 15.1 User s Guide. The correct bibliographic citation for this manual is as follows: SAS Institute

More information

Goodness of Fit Goodness of fit - 2 classes

Goodness of Fit Goodness of fit - 2 classes Goodness of Fit Goodness of fit - 2 classes A B 78 22 Do these data correspond reasonably to the proportions 3:1? We previously discussed options for testing p A = 0.75! Exact p-value Exact confidence

More information

Frequency table: Var2 (Spreadsheet1) Count Cumulative Percent Cumulative From To. Percent <x<=

Frequency table: Var2 (Spreadsheet1) Count Cumulative Percent Cumulative From To. Percent <x<= A frequency distribution is a kind of probability distribution. It gives the frequency or relative frequency at which given values have been observed among the data collected. For example, for age, Frequency

More information

Reference: Chapter 7 of Devore (8e)

Reference: Chapter 7 of Devore (8e) Reference: Chapter 7 of Devore (8e) CONFIDENCE INTERVAL ESTIMATORS Maghsoodloo An interval estimator of a population parameter is of the form L < < u at a confidence Pr (or a confidence coefficient) of

More information

Lab #11. Variable B. Variable A Y a b a+b N c d c+d a+c b+d N = a+b+c+d

Lab #11. Variable B. Variable A Y a b a+b N c d c+d a+c b+d N = a+b+c+d BIOS 4120: Introduction to Biostatistics Breheny Lab #11 We will explore observational studies in today s lab and review how to make inferences on contingency tables. We will only use 2x2 tables for today

More information

Formulas and Tables. for Elementary Statistics, Tenth Edition, by Mario F. Triola Copyright 2006 Pearson Education, Inc. ˆp E p ˆp E Proportion

Formulas and Tables. for Elementary Statistics, Tenth Edition, by Mario F. Triola Copyright 2006 Pearson Education, Inc. ˆp E p ˆp E Proportion Formulas and Tables for Elementary Statistics, Tenth Edition, by Mario F. Triola Copyright 2006 Pearson Education, Inc. Ch. 3: Descriptive Statistics x Sf. x x Sf Mean S(x 2 x) 2 s Å n 2 1 n(sx 2 ) 2 (Sx)

More information

L6: Regression II. JJ Chen. July 2, 2015

L6: Regression II. JJ Chen. July 2, 2015 L6: Regression II JJ Chen July 2, 2015 Today s Plan Review basic inference based on Sample average Difference in sample average Extrapolate the knowledge to sample regression coefficients Standard error,

More information

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto.

Clinical Trials. Olli Saarela. September 18, Dalla Lana School of Public Health University of Toronto. Introduction to Dalla Lana School of Public Health University of Toronto olli.saarela@utoronto.ca September 18, 2014 38-1 : a review 38-2 Evidence Ideal: to advance the knowledge-base of clinical medicine,

More information

Mock Exam - 2 hours - use of basic (non-programmable) calculator is allowed - all exercises carry the same marks - exam is strictly individual

Mock Exam - 2 hours - use of basic (non-programmable) calculator is allowed - all exercises carry the same marks - exam is strictly individual Mock Exam - 2 hours - use of basic (non-programmable) calculator is allowed - all exercises carry the same marks - exam is strictly individual Question 1. Suppose you want to estimate the percentage of

More information

Chapter 5: HYPOTHESIS TESTING

Chapter 5: HYPOTHESIS TESTING MATH411: Applied Statistics Dr. YU, Chi Wai Chapter 5: HYPOTHESIS TESTING 1 WHAT IS HYPOTHESIS TESTING? As its name indicates, it is about a test of hypothesis. To be more precise, we would first translate

More information

Lecture 11: Simple Linear Regression

Lecture 11: Simple Linear Regression Lecture 11: Simple Linear Regression Readings: Sections 3.1-3.3, 11.1-11.3 Apr 17, 2009 In linear regression, we examine the association between two quantitative variables. Number of beers that you drink

More information

Statistical Inference

Statistical Inference Statistical Inference Bernhard Klingenberg Institute of Statistics Graz University of Technology Steyrergasse 17/IV, 8010 Graz www.statistics.tugraz.at February 12, 2008 Outline Estimation: Review of concepts

More information

Section Inference for a Single Proportion

Section Inference for a Single Proportion Section 8.1 - Inference for a Single Proportion Statistics 104 Autumn 2004 Copyright c 2004 by Mark E. Irwin Inference for a Single Proportion For most of what follows, we will be making two assumptions

More information

GOV 2001/ 1002/ E-2001 Section 3 Theories of Inference

GOV 2001/ 1002/ E-2001 Section 3 Theories of Inference GOV 2001/ 1002/ E-2001 Section 3 Theories of Inference Solé Prillaman Harvard University February 11, 2015 1 / 48 LOGISTICS Reading Assignment- Unifying Political Methodology chs 2 and 4. Problem Set 3-

More information

Inference for Single Proportions and Means T.Scofield

Inference for Single Proportions and Means T.Scofield Inference for Single Proportions and Means TScofield Confidence Intervals for Single Proportions and Means A CI gives upper and lower bounds between which we hope to capture the (fixed) population parameter

More information

Lecture 5: Determining Sample Size

Lecture 5: Determining Sample Size Lecture 5: Determining Sample Size Montgomery: Section 3.7 and 13.4 1 Lecture 5 Page 1 Choice of Sample Size: Fixed Effects Can determine the sample size for OverallF test Contrasts of interest For simplicity,

More information

Chapter 3. Comparing two populations

Chapter 3. Comparing two populations Chapter 3. Comparing two populations Contents Hypothesis for the difference between two population means: matched pairs Hypothesis for the difference between two population means: independent samples Two

More information

Week 7.1--IES 612-STA STA doc

Week 7.1--IES 612-STA STA doc Week 7.1--IES 612-STA 4-573-STA 4-576.doc IES 612/STA 4-576 Winter 2009 ANOVA MODELS model adequacy aka RESIDUAL ANALYSIS Numeric data samples from t populations obtained Assume Y ij ~ independent N(μ

More information

Chapter 7. Hypothesis Testing

Chapter 7. Hypothesis Testing Chapter 7. Hypothesis Testing Joonpyo Kim June 24, 2017 Joonpyo Kim Ch7 June 24, 2017 1 / 63 Basic Concepts of Testing Suppose that our interest centers on a random variable X which has density function

More information

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios

ST3241 Categorical Data Analysis I Two-way Contingency Tables. 2 2 Tables, Relative Risks and Odds Ratios ST3241 Categorical Data Analysis I Two-way Contingency Tables 2 2 Tables, Relative Risks and Odds Ratios 1 What Is A Contingency Table (p.16) Suppose X and Y are two categorical variables X has I categories

More information