Sample Size Determination

Sample Size Determination 2018

The number of subjects in a clinical study should always be large enough to provide a reliable answer to the question(s) addressed. The sample size is usually determined by the primary objective of the trial. The sample size calculation should be explicitly described in the protocol. (from ICH-E9)

Errors with Hypothesis Testing

                    Accept H0                        Reject H0
E = C (H0 true)     OK                               Type I error (false positive)
E ≠ C (H1 true)     Type II error (false negative)   OK

H0: E = C (ineffective)
H1: E ≠ C (effective)
E: effect of experimental drug
C: effect of control

Type I Error
- Concluding for the alternative hypothesis (new drug effective) while the null hypothesis is true (false positive)
- Probability of Type I error = significance level, or α level
- Value needs to be specified in the protocol and should be small
- Explicit guidance from regulatory authorities, e.g. 0.05
- One-sided or two-sided?
- Multiple testing?

Type II Error
- Concluding for the null hypothesis (new drug ineffective) while the alternative hypothesis is true (false negative)
- Probability of Type II error = β level
- Value to be chosen by the sponsor, and should be small
- No explicit guidance from regulatory authorities; common choices are 0.20 or 0.10
- Power = 1 - β = 1 - Type II error rate = P(reject H0 when H1 is true) = P(concluding that the drug is effective when it is)
- Typical values of power: 0.80 or 0.90

Power and sample size
Suppose we want to test if a drug is better than a placebo, or if a higher dose is better than a lower dose.
- Sample size: How many patients should we include in our clinical trial, to give ourselves a good chance of detecting any effects of the drug?
- Power: Assuming that the drug has an effect, what is the probability that our clinical trial will give a significant result?

Sample Size and Power
Sample size is contingent on design, true effect, underlying variation, analysis method, outcome, etc. With the wrong sample size, you will either
- not be able to draw conclusions because the study is underpowered, or
- waste time and money because your study is larger than it needed to be to answer the question of interest.

Sample Size and Power
Sample size ALWAYS requires the investigator to make some assumptions:
- How much better do you expect the experimental therapy group to perform than the standard therapy group?
- How much variability do we expect in measurements?
- What would be a clinically relevant improvement?
The statistician CANNOT tell what these numbers should be. It is the responsibility of the clinical investigators to define these parameters.

Sample Size Calculation
The following items should be specified:
- the primary variable
- the statistical test method
- the null hypothesis and the alternative hypothesis
- the study design
- the Type I error
- the Type II error
- the variability in the study population
- how treatment withdrawals will be handled
- the presumed effect

Three common settings
1. Continuous outcome: e.g., number of units of blood transfused, CD4 cell counts, LDL-C.
2. Binary outcome: e.g., response vs. no response, disease vs. no disease.
3. Time-to-event outcome: e.g., time to progression, time to death.

Continuous outcomes
Easiest to discuss. Sample size depends on
- Δ: difference under the alternative hypothesis (the difference to be detected)
- α: type I error
- β: type II error
- σ: standard deviation
- r: ratio of the number of patients in the two groups (usually r = 1)

Tests and Confidence Intervals
You can use a confidence interval (CI) for hypothesis testing. In the typical case, if the CI for an effect does not span 0 then you can reject the null hypothesis. But a CI can be used for more. For example:
- You can make a statement about the range of effects you believe to be likely (the ones in the CI). You can't do that with just a t-test.
- You can also use it to make statements about the null, which you can't do with a t-test. If the t-test doesn't reject the null then you just say that you can't reject the null, which isn't saying much. But if you have a narrow confidence interval around the null then you can suggest that the null, or a value close to it, is likely the true value and suggest the effect of the treatment is too small to be meaningful.

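One-sample
Assume that we wish to study a hypothesis related to a sample of independent and identically distributed normal random variables with mean μ and standard deviation σ. Assuming σ is known, a (1-α)100% confidence interval for μ is given by

$$\bar{Y} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

The maximum error in estimating μ is $E = Z_{\alpha/2}\,\sigma/\sqrt{n}$, which is a function of n. If we want to prespecify E, then n can be chosen according to

$$\sqrt{n} = \frac{Z_{\alpha/2}\,\sigma}{E} \quad\Longrightarrow\quad n = \left(\frac{Z_{\alpha/2}\,\sigma}{E}\right)^{2}$$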
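As a rough illustration of the precision-based rule n = (Z_{α/2} σ / E)², the sketch below rounds the result up to the next whole subject. The function name `n_for_margin` and the example values (σ = 15, E = 5) are illustrative assumptions, not taken from the slides.

```python
from math import ceil
from statistics import NormalDist


def n_for_margin(sigma, margin, conf_level=0.95):
    """Smallest n such that Z_{alpha/2} * sigma / sqrt(n) <= margin (sigma known)."""
    alpha = 1.0 - conf_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # upper alpha/2 quantile
    return ceil((z * sigma / margin) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: sigma = 15, desired maximum error E = 5
    print(n_for_margin(sigma=15.0, margin=5.0))
```

Halving the margin roughly quadruples n, since n grows with 1/E².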

This takes care of the problem of determining the sample size required to attain a certain precision in estimating μ using a confidence interval at a certain confidence level. Since a confidence interval is equivalent to a test, the above procedure can also be used to test a null hypothesis. Suppose thus that we wish to test the hypothesis

H0: μ = μ0 against Ha: μ > μ0

with a significance level α. For a specific alternative H1: μ = μ0 + Δ, where Δ > 0, the power of the test is given by

$$1-\beta = P\!\left(Z > Z_{\alpha} - \frac{\Delta\sqrt{n}}{\sigma}\right) = \Phi\!\left(\frac{\Delta\sqrt{n}}{\sigma} - Z_{\alpha}\right)$$

where Z is a standard normal variable and Φ is its distribution function.
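The power of the one-sided, known-σ test can be evaluated numerically. The sketch below assumes the usual normal-theory expression, power = Φ(Δ√n/σ - Z_α); the function name and the example inputs are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def power_one_sided_z(delta, sigma, n, alpha=0.05):
    """Power of the one-sided z-test of H0: mu = mu0 against mu = mu0 + delta."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1.0 - alpha)  # one-sided critical value
    return nd.cdf(delta * sqrt(n) / sigma - z_alpha)


if __name__ == "__main__":
    # Hypothetical inputs: shift of half a standard deviation, 25 subjects
    print(round(power_one_sided_z(delta=0.5, sigma=1.0, n=25), 3))
```

When Δ = 0 the function returns α, as it should: under the null hypothesis the rejection probability is the significance level.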

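Two samples
H0: μ1 - μ2 = 0 against H1: μ1 - μ2 ≠ 0

With known variances σ1² and σ2² and n subjects per group, the power against a specific alternative μ1 = μ2 + Δ is given by

$$1-\beta = P\!\left(\frac{|d|}{\sigma_d} > Z_{\alpha/2}\right), \qquad d = \bar{Y}_1 - \bar{Y}_2, \qquad \sigma_d = \sqrt{\frac{\sigma_1^2 + \sigma_2^2}{n}}$$

d is normal, so

$$1-\beta = P\!\left(Z > Z_{\alpha/2} - \frac{\Delta}{\sigma_d}\right) + P\!\left(Z < -Z_{\alpha/2} - \frac{\Delta}{\sigma_d}\right) \approx P\!\left(Z > Z_{\alpha/2} - \frac{\Delta}{\sigma_d}\right)$$

$$n = \frac{(\sigma_1^2 + \sigma_2^2)\,(Z_{\alpha/2} + Z_{\beta})^2}{\Delta^2}$$

In the one-sided case we have

$$n = \frac{(\sigma_1^2 + \sigma_2^2)\,(Z_{\alpha} + Z_{\beta})^2}{\Delta^2}$$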

Equal variances
For testing the difference in two means, with equal variance σ² and equal allocation to each arm, the number of subjects per arm is

$$n = \frac{2\sigma^2\,(Z_{\alpha/2} + Z_{\beta})^2}{\Delta^2}$$
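A minimal sketch of the per-group sample size for comparing two means with known variances, assuming the normal-approximation formula n = (σ1² + σ2²)(Z + Z_β)²/Δ², with Z = Z_{α/2} for a two-sided test and Z = Z_α for a one-sided test. The function name and argument names are illustrative assumptions.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(delta, sigma1, sigma2, alpha=0.05, power=0.80, two_sided=True):
    """Per-group n for a two-sample z-test (normal approximation, equal allocation)."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1.0 - alpha / 2.0 if two_sided else 1.0 - alpha)
    z_b = nd.inv_cdf(power)  # Z_beta, the upper beta quantile
    return ceil((sigma1 ** 2 + sigma2 ** 2) * (z_a + z_b) ** 2 / delta ** 2)


if __name__ == "__main__":
    # Equal variances: sigma1 = sigma2 = 0.15, difference to detect 0.08
    print(n_per_group(0.08, 0.15, 0.15))                   # two-sided
    print(n_per_group(0.08, 0.15, 0.15, two_sided=False))  # one-sided
```

Because this uses normal quantiles, it can come out slightly smaller than a calculation based on the t distribution, which accounts for estimating σ.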

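One proportion
We want to test H0: P = P0 against Ha: P > P0, at the specific alternative P = P1, with a certain power (1-β) and with a certain significance level (α). The required sample size is

$$n = \frac{\left(Z_{\alpha}\sqrt{P_0(1-P_0)} + Z_{\beta}\sqrt{P_1(1-P_1)}\right)^2}{(P_1 - P_0)^2}$$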
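The one-proportion calculation can be sketched as follows, assuming the normal-approximation formula with the null variance P0(1-P0) attached to Z_α and the alternative variance P1(1-P1) attached to Z_β. The function name and the example proportions (0.3 versus 0.5) are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_one_proportion(p0, p1, alpha=0.05, power=0.80):
    """Sample size for a one-sided test of H0: P = p0 against P = p1 (normal approx.)."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1.0 - alpha)
    z_b = nd.inv_cdf(power)
    numerator = z_a * sqrt(p0 * (1.0 - p0)) + z_b * sqrt(p1 * (1.0 - p1))
    return ceil((numerator / (p1 - p0)) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: null response rate 0.3, alternative 0.5
    print(n_one_proportion(0.3, 0.5))
```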

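Two proportions
We want to test H0: P1 = P2 against Ha: P1 ≠ P2 with a certain power (1-β) and with a certain significance level (α). The required sample size per group is

$$n = \frac{\left(Z_{\alpha/2}\sqrt{2\bar{P}(1-\bar{P})} + Z_{\beta}\sqrt{P_1(1-P_1) + P_2(1-P_2)}\right)^2}{(P_1 - P_2)^2}, \qquad \bar{P} = \frac{P_1 + P_2}{2}$$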
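For two proportions, a minimal sketch under the pooled-variance normal approximation (pooled proportion under H0, separate proportions under the alternative, equal allocation). The function name and the example proportions (0.6 versus 0.4) are illustrative assumptions.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Per-group n for a two-sided comparison of two proportions (normal approx.)."""
    nd = NormalDist()
    z_a = nd.inv_cdf(1.0 - alpha / 2.0)
    z_b = nd.inv_cdf(power)
    p_bar = (p1 + p2) / 2.0  # pooled proportion under H0
    numerator = (z_a * sqrt(2.0 * p_bar * (1.0 - p_bar))
                 + z_b * sqrt(p1 * (1.0 - p1) + p2 * (1.0 - p2)))
    return ceil((numerator / (p1 - p2)) ** 2)


if __name__ == "__main__":
    # Hypothetical inputs: response rates 0.6 and 0.4
    print(n_two_proportions(0.6, 0.4))
```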

A sample size of 57 in each group will have 80% power to detect a difference in means of 0.08, assuming that the common standard deviation is 0.15, using a two-group t-test with a 0.05 two-sided significance level.

A sample size of 50 in each group will have 80.756% power to detect a difference in means of 4, assuming that the common standard deviation is 7, using a two-group t-test with a 0.05 two-sided significance level.
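The software statements above are based on the t-test. As a sanity check, the power can be approximated with the normal distribution; the sketch below is such an approximation, not the software's algorithm, and the function name is an illustrative assumption. It gives a value slightly above the t-based figure because it treats σ as known.

```python
from math import sqrt
from statistics import NormalDist


def power_two_group_z(delta, sigma, n_per_group, alpha=0.05):
    """Two-sided power for comparing two means, normal approximation, equal n."""
    nd = NormalDist()
    z = nd.inv_cdf(1.0 - alpha / 2.0)
    shift = delta / (sigma * sqrt(2.0 / n_per_group))  # delta / SE of the difference
    # Rejection in the upper tail plus the (tiny) lower tail
    return nd.cdf(shift - z) + nd.cdf(-shift - z)


if __name__ == "__main__":
    print(round(power_two_group_z(4.0, 7.0, 50), 3))    # compare with 80.756%
    print(round(power_two_group_z(0.08, 0.15, 57), 3))  # compare with 80%
```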

Analysis of variance
We want to test H0: τ1 = τ2 = ... = τk against Ha: at least one τi is not zero.

Take: [equations garbled in the source; they relate n to quantiles of the F distribution]

The last equation is difficult to solve, but tables are available.

StudySize

Survival analysis
Assume we plan an RCT aiming at comparing a new treatment (drug) with an old one (control). Such a comparison can be made in terms of the hazard ratio or some function of it. Let d stand for the drug and c for the control. In what follows we describe the sample size required to demonstrate that the drug is better than the control treatment. This will be done under some assumptions, in two steps.

Step 1: specify the number of events needed.
Step 2: specify the number of patients needed to obtain the right number of events.

Step 1
In what follows we will use the notation below to give a formula specifying the number of events needed to have a certain asymptotic power in many different cases, such as
- the log-rank test
- the score test
- parametric tests based on the exponential distribution
- etc.

Notation:
- d: the required number of events
- C: the critical value of the test
- Z_power: the upper quantile of the standard normal distribution
- p: the proportion of individuals receiving the new drug
- q = 1 - p: the proportion of individuals receiving the old drug

$$d = \frac{(C + Z_{\text{power}})^2}{p\,q\,\theta^2}$$

It is not clear how θ should be estimated. The difficulty lies in the fact that this depends on the exact situation at hand. We illustrate this through some examples.
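The events formula d = (C + Z_power)² / (p q θ²) is easy to evaluate directly. The sketch below assumes that θ is supplied through e^θ (called `hazard_ratio` here, an illustrative name) and that C and Z_power are standard normal quantiles; the result is returned unrounded.

```python
from math import log
from statistics import NormalDist


def required_events(hazard_ratio, power=0.80, alpha=0.05, p=0.5, two_sided=True):
    """Number of events d = (C + Z_power)^2 / (p * q * theta^2), theta = log(hazard_ratio)."""
    nd = NormalDist()
    c = nd.inv_cdf(1.0 - alpha / 2.0 if two_sided else 1.0 - alpha)  # critical value C
    z_power = nd.inv_cdf(power)
    theta = log(hazard_ratio)
    q = 1.0 - p
    return (c + z_power) ** 2 / (p * q * theta ** 2)


if __name__ == "__main__":
    # e^theta = 1.5, 80% power, two-sided alpha = 0.05, equal allocation
    print(round(required_events(1.5, power=0.80)))
```

Note that d counts events, not patients: the number of patients must be large enough for this many events to be observed.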

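EXAMPLE 1
Assume that our observations follow exp(λ) under the old drug (control) and exp(λe^θ) under the new drug. Then the ratio of the medians of the two distributions will be

$$\frac{M_d}{M_c} = \frac{\ln 2 / (\lambda e^{\theta})}{\ln 2 / \lambda} = \frac{1}{e^{\theta}} = e^{-\theta}$$

To be able to discover a 50% increase in the median we take

$$\frac{M_d}{M_c} = \frac{3}{2} \quad\Longrightarrow\quad e^{-\theta} = \frac{3}{2}, \qquad e^{\theta} = \frac{2}{3}$$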

1. It is possible to draw up a table specifying d as a function of the remaining parameters when α = 0.05, p = q, and the test is two-sided.
2. Observe that e^θ and e^(-θ) lead to the same number of events.
3. The largest power is obtained when p = q.
4. When p and q are not equal, divide the table value by 4pq.
5. For a one-sided test, only C needs to be changed, from 1.96 to 1.64.
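Notes 4 and 5 amount to simple rescalings of a tabulated number of events. The helpers below are a hedged sketch of those two adjustments; the function names are illustrative assumptions, and the starting value 191 corresponds to e^θ = 1.5 at 80% power with equal allocation.

```python
from math import ceil
from statistics import NormalDist


def adjust_for_allocation(d_table, p):
    """Note 4: divide the equal-allocation value by 4pq when p != q."""
    q = 1.0 - p
    return d_table / (4.0 * p * q)


def rescale_to_one_sided(d_two_sided, power, alpha=0.05):
    """Note 5: replace the two-sided critical value C by the one-sided one."""
    nd = NormalDist()
    z_power = nd.inv_cdf(power)
    c_two = nd.inv_cdf(1.0 - alpha / 2.0)  # about 1.96
    c_one = nd.inv_cdf(1.0 - alpha)        # about 1.64
    return d_two_sided * ((c_one + z_power) / (c_two + z_power)) ** 2


if __name__ == "__main__":
    print(ceil(adjust_for_allocation(191, p=1 / 3)))  # 1:2 allocation
    print(round(rescale_to_one_sided(191, power=0.80), 1))
```

Since 4pq is at most 1 and equals 1 only when p = q, unequal allocation always increases the required number of events, which is the content of note 3.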

The number of events needed for a certain power under equal allocation:

e^θ      1.15   1.25   1.50   1.75   2.00
power
0.5       787    309     94     49     32
0.7      1264    496    150     79     51
0.8      1607    631    191    100     65
0.9      2152    844    256    134     88

Crossover Designs
In this type of study we are interested in testing a null hypothesis of the form H0: μT = μP against Ha: μT ≠ μP.

Crossover design (each arm receives the drug in one period and the placebo in the other, in opposite order):

             Period j=1   Period j=2
Arm k=1      T11          T12
Arm k=2      T21          T22

A general model we can use for the response of the ith individual in the kth arm at the jth period is

$$Y_{ijk} = \mu_{jk} + S_{ik} + e_{ijk}$$

where μ_jk is the fixed mean for period j in arm k, and S and e are independent random effects with means zero and standard deviations σ_S and σ. The above hypotheses can be tested using the statistic T_d, which can also be used to determine the power of the test against a specific alternative of the type μT - μP = Δ.

Assuming equal sample sizes in both arms, and that σ_d can be estimated from old studies, we have

$$n = \frac{(Z_{\alpha/2} + Z_{\beta})^2\,\sigma_d^2}{2\Delta^2} \qquad (*)$$

where Δ stands for the clinically meaningful difference. (*) can now be used to determine n.

When the sample size in each sequence group is 10 (a total sample size of 20), a 2 x 2 crossover design will have 80% power to detect a difference in means of 2, assuming that the Crossover ANOVA sqrt(MSE) is 2.121 (the standard deviation of differences, σ_d, is 3), using a two-group t-test (Crossover ANOVA) with a 0.05 two-sided significance level.
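A normal-approximation check of the crossover example is sketched below. It assumes the usual formula n = (Z_{α/2} + Z_β)² σ_d² / (2Δ²) per sequence group, with σ_d = 3 and an assumed difference Δ = 2; the function name is illustrative. The t-based software figure of 10 per sequence is slightly larger because it accounts for estimating the variance.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_sequence(delta, sigma_d, alpha=0.05, power=0.80):
    """Per-sequence n for a 2 x 2 crossover, normal approximation.

    sigma_d is the standard deviation of within-subject differences.
    """
    nd = NormalDist()
    z_a = nd.inv_cdf(1.0 - alpha / 2.0)
    z_b = nd.inv_cdf(power)
    return ceil((z_a + z_b) ** 2 * sigma_d ** 2 / (2.0 * delta ** 2))


if __name__ == "__main__":
    print(n_per_sequence(delta=2.0, sigma_d=3.0))
    # sqrt(MSE) of the crossover ANOVA equals sigma_d / sqrt(2)
    print(round(3.0 / sqrt(2.0), 3))
```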

Software
For common situations, software is available.

Software available for purchase:
- NQuery
- StudySize
- PASS
- Power and Precision
- etc.

Free online software, available via web search:
- http://biostat.mc.vanderbilt.edu/twiki/bin/view/main/powersamplesize
- http://calculators.stat.ucla.edu
- http://hedwig.mgh.harvard.edu/sample_size/size.html