1 In the name of God
Determining the Size of a Sample
By: Dr. Hoseyn Falahzadeh

2 Sample Accuracy (Ch 13)
Sample accuracy refers to how close a random sample's statistic is to the true population value it represents.
Important points:
- Sample size is not related to representativeness.
- Sample size is related to accuracy.

3 Sample Size and Accuracy
Intuition: which is more accurate, a large probability sample or a small probability sample?
The larger a probability sample is, the more accurate it is (less sample error).

4 Sample Size and Accuracy
[Figure: accuracy (± sample error, vertical axis from 0% to 16%) plotted against sample size. Annotation on the plot: n = 1,450; 4% - 2% = ±2%.]
Probability sample accuracy (error) can be calculated with a simple formula and expressed as a ± % number.
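The slide does not spell out the "simple formula". A minimal sketch, assuming the usual large-sample expression e = z × sqrt(p × q / n) with p and q in percent (the same symbols the later slides use); the function name `sample_error` and the sample sizes in the loop are illustrative choices, not values from the slides:

```python
import math

def sample_error(n, p=50.0, z=1.96):
    """Return the ± sample error, in percentage points, for a sample of size n.

    Assumes a simple random sample, p and q expressed in percent,
    and the worst-case variability p = q = 50 by default.
    """
    q = 100.0 - p
    return z * math.sqrt(p * q / n)

# Error shrinks as n grows, but with diminishing returns.
for n in (100, 500, 1000, 2000):
    print(f"n = {n:5d}: ±{sample_error(n):.1f}%")
```

Quadrupling the sample size only halves the error, which is why the curve in the figure flattens out for large n.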

5 Sample Size Formula
Fortunately, statisticians have given us a formula based on these relationships. The formula requires that we:
- specify the amount of confidence we wish to have,
- estimate the variance in the population,
- specify the amount of accuracy we want.
Once we specify these, the formula tells us what sample size, n, we need to use.

6 Sample Size and Population Size
Where is N (the size of the population) in the sample size determination formula?
[Table: population size (starting at 10,000) against the required sample size at e = ±3% and at e = ±4%; the cell values are not recoverable from the transcription.]
In almost all cases, the accuracy (sample error) of a probability sample is independent of the size of the population.

7 Sample Size Formula
Standard sample size formula for estimating a percentage:
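The formula itself did not survive on this slide. The form below is the standard sample size formula for a percentage, written with the symbols used in the worked example on slide 12 (z, p, q, e); it is offered as a reconstruction under that assumption, not as a copy of the original slide:

```latex
n = \frac{z^{2}\,(p \times q)}{e^{2}}
```

Here z is the standard normal value for the chosen confidence level, p is the estimated percentage, q = 100 - p, and e is the acceptable sample error in percentage points.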

8 Practical Considerations in Sample Size Determination
How to estimate variability (p times q) in the population:
- Expect the worst case (p = 50; q = 50).
- Estimate the variability from previous studies, or conduct a pilot study.

9 Practical Considerations in Sample Size Determination
How to determine the amount of desired sample error:
- The convention is ±5%.
- The more important the decision, the smaller the acceptable sample error (the more precise the estimate must be).

10 Practical Considerations in Sample Size Determination
How to decide on the level of confidence desired:
- The more confidence, the larger the sample size.
- The convention is 95% (z = 1.96).
- The more important the decision, the more likely the manager will want more confidence, for example 99% confidence (z = 2.58).
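The z values quoted above can be checked from the standard normal distribution. A small sketch using Python's standard library (`statistics.NormalDist`, available from Python 3.8); the helper name `z_for_confidence` is an illustrative choice:

```python
from statistics import NormalDist

def z_for_confidence(confidence):
    """Two-sided z value for a confidence level given as a fraction, e.g. 0.95."""
    alpha = 1.0 - confidence
    # Half of alpha sits in each tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

print(round(z_for_confidence(0.95), 2))  # 1.96
print(round(z_for_confidence(0.99), 2))  # 2.58
```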

11 Example: Estimating a Percentage in the Population
What is the required sample size?
Five years ago a survey showed that 42% of consumers were aware of the company's brand (consumers were either aware or not aware).
After an intense ad campaign, management wants to conduct another survey, and they want to be 95% confident that the survey estimate will be within ±5% of the true percentage of aware consumers in the population.
What is n?

12 Estimating a Percentage: What is n?
z = 1.96 (95% confidence)
p = 42
q = 100 - p = 58
e = 5
What is n?
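The slide leaves the answer open. A sketch of the arithmetic, assuming the standard formula n = z² × p × q / e² with p, q and e in percent; the function name is illustrative, and rounding up to the next whole respondent is a convention adopted here rather than something the slide states:

```python
import math

def sample_size_for_percentage(p, e, z=1.96):
    """Unrounded sample size for estimating a percentage p within ±e points."""
    q = 100.0 - p
    return (z ** 2) * p * q / (e ** 2)

n = sample_size_for_percentage(p=42, e=5)
print(round(n, 1))   # 374.3
print(math.ceil(n))  # 375 respondents after rounding up
```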

13 Estimating a Mean
Estimating a mean requires a different formula (see MRI 13.2, p. 378).
- z is determined the same way (1.96 or 2.58).
- e is expressed in terms of the units we are estimating (e.g., if we are measuring attitudes on a 1-7 scale, we may want the error to be no more than ±0.5 scale units).
- s is a little more difficult to estimate.

14 Estimating s
Since we are estimating a mean, we can assume that our data are either interval or ratio. When we have interval or ratio data, the standard deviation, s, may be used as a measure of variability.

15 Estimating s
How to estimate s?
- Use the standard deviation from a previous study on the target population.
- Conduct a pilot study of a few members of the target population and calculate s.
- Estimate the range the value you are estimating can take on (minimum and maximum value) and divide the range by 6.

16 Estimating s
Why divide the range by 6?
The range covers the entire distribution, and ±3 (or 6) standard deviations cover about 99.7% of the area under the normal curve. Since we are estimating one standard deviation, we divide the range by 6.
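The formula for a mean is only cited (MRI 13.2), not shown. A minimal sketch, assuming the standard form n = s² × z² / e² and the range/6 rule from the slides; the 1-7 scale and the ±0.5 error come from slide 13, while the function name and the rounding-up step are illustrative:

```python
import math

def sample_size_for_mean(s, e, z=1.96):
    """Unrounded sample size for estimating a mean within ±e units."""
    return (s ** 2) * (z ** 2) / (e ** 2)

# Range/6 rule on a 1-7 attitude scale: s = (7 - 1) / 6 = 1 scale unit.
scale_min, scale_max = 1, 7
s = (scale_max - scale_min) / 6

n = sample_size_for_mean(s, e=0.5)
print(math.ceil(n))  # 16
```

The result is small because the range/6 rule gives a fairly optimistic s; a pilot study would often suggest a larger standard deviation and therefore a larger n.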

17 Practice Example
A client wants to survey out-shopping intentions (the percentage of people saying yes to a question regarding their intentions to out-shop) among heads of households in Antigonish. The client wants accuracy of ±3%, 19 times out of 20. There are 3,000 households in the catchment area. What sample size should be used?
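The slides do not give a solution. One possible working, as a sketch: "19 times out of 20" is read as 95% confidence (z = 1.96), and since no prior estimate of p is given, the worst case p = 50 from slide 8 is assumed. The finite population correction shown second is not part of the slides; it is included only as an assumption about why the 3,000 households are mentioned.

```python
import math

def sample_size_for_percentage(p, e, z=1.96):
    """Unrounded sample size for estimating a percentage p within ±e points."""
    q = 100.0 - p
    return (z ** 2) * p * q / (e ** 2)

def finite_population_correction(n0, population):
    """Shrink n0 when the sample would be a large share of a small population."""
    return n0 * population / (n0 + population - 1)

n0 = sample_size_for_percentage(p=50, e=3)          # worst-case variability
n_adj = finite_population_correction(n0, 3000)      # assumption, not in the slides

print(math.ceil(n0))     # 1068 without any correction
print(math.ceil(n_adj))  # 788 with the correction for N = 3,000
```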

18 Sample Size Considerations Needed for Two Independent Groups

19 7 ingredients for sample size calculations
1. Research question to be answered
2. Outcome measure
3. Effect size
4. Variability & success proportions
   4.1 For continuous outcome
   4.2 For binary outcome
5. Type I error
6. Type II error
7. Other factors

20 Further explanations of ingredient 1: Research question to be answered
Translate the question into a clear hypothesis! For example:
H0: there is no difference between treatment and control
H1: there are differences between treatment and control
From hypothesis to statistical result to conclusion:
- Statistically significant result (that is, p < 0.05): enough evidence to reject H0; accept H1.
- Statistically non-significant result (that is, p > 0.05): no evidence to reject H0.

21 Further explanations of ingredient 2: Outcome measures
There should be only one primary outcome measure per study!
A study could have a secondary outcome measure, but the sample size and power can only be calculated for the primary outcome, so there may not be enough power for results relating to the secondary outcome.
Recall the two types of variables:
- Continuous
- Categorical (if the variable has 2 categories: binary)

22 Further explanations of ingredient 3: Effect size (d), from the word difference
- The magnitude of the difference that we are looking for: the clinically important difference.
- For 2 treatment arms: the difference in means for a continuous outcome, or the difference in success proportions for a binary outcome.
- The minimum value worth detecting: decide what the minimum "better" means by looking at the endpoint and by considering background noise. Headache? Moderate & severe headache? Migraine?
- Values can be found in previous literature on similar studies or estimated from clinical experience, but make sure they are reasonable (remember GIGO!).

23 Further explanations of ingredient 3: Effect size (d)
Example: In a previous study, morbidity of a certain illness under conventional care is known to be 73%. We are interested in reducing morbidity to 50% (clinically important). Therefore the effect size is 23%, the difference between these morbidities.
Example: Summarising all the studies with a similar setting and characteristics for a specific outcome measure, e.g. pain relief:
- The overall response rate on placebo is 32%.
- The overall response rate on active treatment is 50%.
- The overall estimate of the difference between active and placebo is 18%.
- Of all the differences found in these studies, the smallest difference observed is 12%; this could be the minimum value worth detecting.

24 Further explanations of ingredient 4.1: Variability (σ), pronounced sigma
For continuous outcomes only!
The standard deviation (σ) or variance (σ²) represents the spread of the distribution of a continuous variable.
Values can usually be found in previous literature or estimated from clinical experience, but make sure they are reasonable (GIGO!).

25 Pooled standard deviation
If several studies with variance estimates are available, it is recommended that an overall estimate of the population variance, the pooled variance estimate $\sigma_p^2$, is obtained from the following formula:
$$\sigma_p^2 = \frac{\sum_{i=1}^{k} df_i\,\sigma_i^2}{\sum_{i=1}^{k} df_i} = \frac{df_1\sigma_1^2 + df_2\sigma_2^2 + \cdots + df_k\sigma_k^2}{df_1 + df_2 + \cdots + df_k}$$
where k is the number of studies, $\sigma_i^2$ is the variance estimate from the i-th study, and $df_i$ is the degrees of freedom of this variance (the corresponding number of observations in the group minus 1, i.e. $n_i - 1$).

26 Pooled standard deviation
Example: The following descriptive statistics (number of subjects, mean ± standard deviation) of an outcome measure were reported for each treatment arm:
Treatment A: $n_A = 83$, mean_A ± $\sigma_A$ (values lost in transcription)
Treatment B: $n_B = 87$, mean_B ± $\sigma_B$ (values lost in transcription)
Using the formula above, the pooled variance ($\sigma_p^2$) and the pooled SD ($\sigma_p$) are
$$\sigma_p^2 = \frac{(83-1)\,\sigma_A^2 + (87-1)\,\sigma_B^2}{(83-1) + (87-1)}, \qquad \sigma_p = \sqrt{\sigma_p^2}$$
(the numerical results are not recoverable from the transcription).
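The pooled-variance formula can be sketched in a few lines. The group sizes 83 and 87 come from the slide; the two standard deviations (10.0 and 12.0) are hypothetical stand-ins, because the reported values are missing from the transcription, and the function name is illustrative:

```python
import math

def pooled_variance(groups):
    """Pooled variance from (n_i, sd_i) pairs, weighting by df_i = n_i - 1."""
    numerator = sum((n - 1) * sd ** 2 for n, sd in groups)
    denominator = sum(n - 1 for n, _ in groups)
    return numerator / denominator

# (n, SD) per treatment arm; the SDs are hypothetical, not the slide's values.
groups = [(83, 10.0), (87, 12.0)]

var_p = pooled_variance(groups)
sd_p = math.sqrt(var_p)
print(round(var_p, 2), round(sd_p, 2))
```

The pooled SD always lies between the smallest and largest group SD, pulled toward the groups with more degrees of freedom.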

27 Further explanations of ingredient 4.2: Success proportions (p)
For binary outcomes only! These normally concern Cured/Not Cured, Alive/Dead, etc.

              Alive   Dead   Total   Success proportion
Treatment A   a       b      n_A     p_A = a / n_A
Treatment B   c       d      n_B     p_B = c / n_B

We need to know the success proportion of the binary outcome for each group or treatment arm first; it can be found in previous literature or estimated from clinical experience.
In the table above, suppose we are interested in the proportion Alive; then the success proportions are p_A and p_B for treatments A and B respectively.
Let $\bar{p}$ denote the average success proportion, i.e. $(p_A + p_B)/2$.
We can use this information to find the effect size and the standard deviation:
- The effect size is the difference between the two success proportions, i.e. $p_A - p_B$.
- The estimated standard deviation is $\sqrt{\bar{p}\,(100 - \bar{p})}$, where $\bar{p}$ is between 0 and 100 (a percentage).

28 Further explanations of ingredients 5 & 6: Type I error (α) & Type II error (β)
You should have heard these mentioned in the Hypothesis Testing session, so this is just a reminder.
Because we are sampling from a population:
- uncertainty is introduced;
- the quality of the sample will have an impact on our conclusion;
- error does exist.
There are two types of error:
- Type I error (α): we observe something in our sample that does not exist in the population (the truth), e.g. concluding that drinking water leads to cancer.
- Type II error (β): we observe nothing in our sample although something exists in the population (the truth), e.g. concluding that smoking doesn't lead to cancer.

29 Further explanations of ingredients 5 & 6: Type I error (α) & Type II error (β)

                      No observed difference       Observed difference
No true difference    Well-designed trial (1-α)    Type I error (α)
True difference       Type II error (β)            Well-powered trial (1-β)

Type I error (α): we usually allow for 5%.
Significance level = α = the cut-off point for the p-value, i.e. 0.05.

30 Further explanations of ingredients 5 & 6: Type I error (α) & Type II error (β)

                      No observed difference       Observed difference
No true difference    Well-designed trial (1-α)    Type I error (α)
True difference       Type II error (β)            Well-powered trial (1-β)

Type II error (β): we usually allow for 10% or 20%, more than for Type I error (since Type I error is regarded as the risk to society and hence is more crucial to a pharmaceutical company financially).
Power of the study = 1 - Type II error = 1 - β, usually 80% or 90%: the probability of detecting a difference in our study if there is one in the whole population.

31 Further explanations of ingredient 7: Other factors
The calculated sample size is the number of subjects required at the analysis stage, not the number to start recruiting with, if you want to detect a certain effect size with a specific significance and power.
Study design:
- Response rate: the method of data gathering affects the response rate, e.g. about a 50% response rate for a postal questionnaire.
- Drop-out rate: arises from following subjects over a long period of time, e.g. in a cohort study; usually 20% - 25%.
We can increase the sample size by a suitable percentage to allow for these problems. For example, to allow for losing 25% of the subjects recruited, starting from the calculated sample size (n):
$$\text{Final sample size } (N) = \frac{n}{1 - 0.25} = \frac{n}{0.75}, \quad \text{NOT } n \times (1 + 0.25) \text{ or } 1.25\,n$$
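The reason for dividing rather than multiplying can be shown with a short sketch. The helper name `inflate_for_dropout` and the starting value n = 175 are illustrative; the 25% loss rate is the figure used on the slide:

```python
import math

def inflate_for_dropout(n, dropout_rate):
    """Number to recruit so that about n subjects remain after drop-out."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout_rate must be in [0, 1)")
    return math.ceil(n / (1 - dropout_rate))

n = 175                                   # illustrative calculated sample size
recruit = inflate_for_dropout(n, 0.25)
print(recruit)                            # 234
print(recruit * 0.75 >= n)                # True: enough subjects remain

too_few = math.ceil(n * 1.25)             # the shortcut the slide warns against
print(too_few, too_few * 0.75 >= n)       # 219 False
```

Multiplying by 1.25 recruits 219 subjects, and losing a quarter of them leaves about 164, fewer than the 175 the analysis needs.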

32 Formula for 2 independent groups
Of the 7 ingredients, 4 crucial factors are involved in the actual sample size calculation:
1. Effect size (d): the size of the difference we want to be able to detect.
2. Variability (σ or $\sqrt{\bar{p}\,(100 - \bar{p})}$): the standard deviation of the continuous outcome, or its estimate for the binary outcome.
3. Level of significance (α): the risk of a Type I error we will accept.
4. Power (1-β): the probability of detecting the difference if it exists, where β is the risk of a Type II error we will accept.

33 Formula for 2 independent groups
We use these 4 factors in a general formula to calculate the sample size for 2 groups with a continuous or binary outcome. The formula is:
$$n\ (\text{per group}) = \frac{2\,\left[z_{(1-\alpha/2)} + z_{(1-\beta)}\right]^2}{\Delta^2}$$
where Δ is the standardised effect size, i.e. effect size / variability:
$$\Delta = \frac{d}{\sigma} \ \text{(continuous outcome)}, \qquad \Delta = \frac{p_A - p_B}{\sqrt{\bar{p}\,(100 - \bar{p})}} \ \text{(binary outcome)}$$

34 What is a z-score?
A z-score is the number of standard deviations above or below the mean: $z = (x - \mu)/\sigma$.

35 What are $z_{(1-\alpha/2)}$ and $z_{(1-\beta)}$?
$z_{(1-\alpha/2)}$ is a value from the Normal distribution relating to the significance level:
- If the level of significance is set to 5%, then α = 0.05. For a 2-sided test, $z_{(1-\alpha/2)} = z_{0.975} = 1.96$.
- If the level of significance is set to 1%, then α = 0.01. For a 2-sided test, $z_{(1-\alpha/2)} = z_{0.995} = 2.5758$.
$z_{(1-\beta)}$ is a value from the Normal distribution relating to power:
- If β is set to 10%, then the power is 90%, so 1-β = 0.90. For a 1-sided test, $z_{(1-\beta)} = z_{0.90} = 1.2816$.
- If β is set to 20%, then the power is 80%, so 1-β = 0.80. For a 1-sided test, $z_{(1-\beta)} = z_{0.80} = 0.8416$.

36 Table of z-scores
[Table: z-scores of the standard Normal distribution; the entries are not recoverable from the transcription.]

37 The quick formula
We can pre-calculate $[z_{(1-\alpha/2)} + z_{(1-\beta)}]^2$ and call it k, using the relevant z-scores from the table on the previous slide for different combinations of significance level α and power 1-β. The formula then becomes
$$n\ (\text{per group}) = \frac{2k}{\Delta^2}$$
where Δ is effect size / variability: $d/\sigma$ for a continuous outcome, and $(p_A - p_B)/\sqrt{\bar{p}\,(100 - \bar{p})}$ for a binary outcome.

k           β = 0.10 (90% power)   β = 0.20 (80% power)
α = 0.05    10.51                  7.85
α = 0.01    14.88                  11.68

Remember to multiply the calculated sample size (n) by 2 to allow for 2 groups! Always round up your final sample size.
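The constant k can be recomputed directly from the Normal quantiles rather than read from a table. A sketch using the standard library's `statistics.NormalDist`; the function name `k_factor` is illustrative:

```python
from statistics import NormalDist

def k_factor(alpha, beta):
    """k = (z_(1-alpha/2) + z_(1-beta))^2 for a two-sided test."""
    z = NormalDist().inv_cdf
    return (z(1 - alpha / 2) + z(1 - beta)) ** 2

# Print k for each combination of significance level and Type II error rate.
for alpha in (0.05, 0.01):
    row = [round(k_factor(alpha, beta), 2) for beta in (0.10, 0.20)]
    print(f"alpha = {alpha}: {row}")
```

For α = 0.05 and β = 0.20 this gives 7.85, the value used on the next slide.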

38 Even simpler!
For a 5% significance level and a power of 80%:
$$n = 2 \times \frac{2 \times 7.85}{\Delta^2} \approx \frac{32}{\Delta^2} \quad (\text{total for 2 groups})$$
For a 1% significance level and a power of 90%:
$$n = 2 \times \frac{2 \times 14.88}{\Delta^2} \approx \frac{60}{\Delta^2} \quad (\text{total for 2 groups})$$
A total sample size of n across the two groups will have 80% (respectively 90%) power to detect the standardised effect size Δ, with the test performed at the 5% (respectively 1%) significance level (two-sided).
Note that Δ = δ/σ, where δ is the effect size (written d on the earlier slides); hence the required sample size increases as σ increases or as δ decreases.

39 The 4 factors & sample size
Referring to the quick formula, we can predict the effect on the sample size if we increase or decrease the value of each of the 4 factors:
- If the level of significance (α) decreases, e.g. from 5% to 1%, the sample size increases.
- If the Type II error rate (β) decreases, i.e. the power (1-β) increases, e.g. from 80% to 90%, the sample size increases.
- If the effect size (d) decreases, e.g. detecting a smaller difference between the 2 groups, the sample size increases.
- If the variability (σ) decreases, e.g. assuming the outcome measure has a smaller spread, the sample size decreases.

40 Example with continuous outcome: differences between means
In a trial to compare the effects of two oral contraceptives on blood pressure (over one year), it is anticipated that one drug will increase diastolic blood pressure by 3 mmHg and the other will not change it. The standard deviation (of the changes in blood pressure) in both groups is expected to be 10 mmHg. How many patients are required for this difference to be significant at the 5% level (with 80% power)?
$$n = \frac{2 \times 7.85}{(3/10)^2} = \frac{15.7}{0.09} \approx 174.4$$
which rounds up to 175 women per group, so a total of 350 women need to be recruited.

41 Example with binary outcome: difference between proportions
In a randomised clinical trial, the placebo response is anticipated to be 25% and the active treatment response 65%. How many patients are needed if a two-sided test at the 1% level is planned and a power of 90% is required?
Here $\bar{p} = (25 + 65)/2 = 45$, so
$$n = \frac{2 \times 14.88}{\left[(25 - 65)/\sqrt{45\,(100 - 45)}\right]^2} = \frac{29.76}{0.646} \approx 46.03$$
so n = 47 per group after rounding up, and a total of 94 patients are needed for this study.
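Both worked examples can be reproduced with one small function. This is a sketch of the two-group formula n = 2k / Δ² from the slides, with k computed from exact Normal quantiles instead of the rounded table values; the function and variable names are illustrative:

```python
import math
from statistics import NormalDist

def n_per_group(delta, alpha=0.05, power=0.80):
    """Sample size per group for two independent groups, rounded up.

    delta is the standardised effect size (effect size / variability).
    """
    z = NormalDist().inv_cdf
    k = (z(1 - alpha / 2) + z(power)) ** 2
    return math.ceil(2 * k / delta ** 2)

# Continuous outcome: 3 mmHg difference, SD 10 mmHg, 5% level, 80% power.
delta_continuous = 3 / 10
print(n_per_group(delta_continuous, alpha=0.05, power=0.80))  # 175

# Binary outcome: 25% vs 65% response, 1% level, 90% power.
p_bar = (25 + 65) / 2
delta_binary = (25 - 65) / math.sqrt(p_bar * (100 - p_bar))
print(n_per_group(delta_binary, alpha=0.01, power=0.90))      # 47
```

The sign of Δ does not matter because it is squared; doubling each result gives the totals of 350 and 94 quoted in the examples.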


More information

Distribution of sample means

Distribution of sample means Two types of statistics: and Distribution of sample means Mean Standard deviation Population Sample The relationship between Population and Samples is described in terms of Probability A jar with 100 marbles,

More information

Hypothesis testing for µ:

Hypothesis testing for µ: University of California, Los Angeles Department of Statistics Statistics 10 Elements of a hypothesis test: Hypothesis testing Instructor: Nicolas Christou 1. Null hypothesis, H 0 (always =). 2. Alternative

More information

COGS 14B: INTRODUCTION TO STATISTICAL ANALYSIS

COGS 14B: INTRODUCTION TO STATISTICAL ANALYSIS COGS 14B: INTRODUCTION TO STATISTICAL ANALYSIS TA: Sai Chowdary Gullapally scgullap@eng.ucsd.edu Office Hours: Thursday (Mandeville) 3:30PM - 4:30PM (or by appointment) Slides: I am using the amazing slides

More information

HYPOTHESIS TESTING. Hypothesis Testing

HYPOTHESIS TESTING. Hypothesis Testing MBA 605 Business Analytics Don Conant, PhD. HYPOTHESIS TESTING Hypothesis testing involves making inferences about the nature of the population on the basis of observations of a sample drawn from the population.

More information

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels.

Contingency Tables. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. Contingency Tables Definition & Examples. Contingency tables are used when we want to looking at two (or more) factors. Each factor might have two more or levels. (Using more than two factors gets complicated,

More information

Sociology 6Z03 Review II

Sociology 6Z03 Review II Sociology 6Z03 Review II John Fox McMaster University Fall 2016 John Fox (McMaster University) Sociology 6Z03 Review II Fall 2016 1 / 35 Outline: Review II Probability Part I Sampling Distributions Probability

More information

Chapter 26: Comparing Counts (Chi Square)

Chapter 26: Comparing Counts (Chi Square) Chapter 6: Comparing Counts (Chi Square) We ve seen that you can turn a qualitative variable into a quantitative one (by counting the number of successes and failures), but that s a compromise it forces

More information

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College

ANOVA Situation The F Statistic Multiple Comparisons. 1-Way ANOVA MATH 143. Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College An example ANOVA situation Example (Treating Blisters) Subjects: 25 patients with blisters Treatments: Treatment A, Treatment

More information

Two sided, two sample t-tests. a) IQ = 100 b) Average height for men = c) Average number of white blood cells per cubic millimeter is 7,000.

Two sided, two sample t-tests. a) IQ = 100 b) Average height for men = c) Average number of white blood cells per cubic millimeter is 7,000. Two sided, two sample t-tests. I. Brief review: 1) We are interested in how a sample compares to some pre-conceived notion. For example: a) IQ = 100 b) Average height for men = 5 10. c) Average number

More information

16.400/453J Human Factors Engineering. Design of Experiments II

16.400/453J Human Factors Engineering. Design of Experiments II J Human Factors Engineering Design of Experiments II Review Experiment Design and Descriptive Statistics Research question, independent and dependent variables, histograms, box plots, etc. Inferential

More information

STAT 201 Assignment 6

STAT 201 Assignment 6 STAT 201 Assignment 6 Partial Solutions 12.1 Research question: Do parents in the school district support the new education program? Parameter: p = proportion of all parents in the school district who

More information

One-Way ANOVA Cohen Chapter 12 EDUC/PSY 6600

One-Way ANOVA Cohen Chapter 12 EDUC/PSY 6600 One-Way ANOVA Cohen Chapter 1 EDUC/PSY 6600 1 It is easy to lie with statistics. It is hard to tell the truth without statistics. -Andrejs Dunkels Motivating examples Dr. Vito randomly assigns 30 individuals

More information

PSY 305. Module 3. Page Title. Introduction to Hypothesis Testing Z-tests. Five steps in hypothesis testing

PSY 305. Module 3. Page Title. Introduction to Hypothesis Testing Z-tests. Five steps in hypothesis testing Page Title PSY 305 Module 3 Introduction to Hypothesis Testing Z-tests Five steps in hypothesis testing State the research and null hypothesis Determine characteristics of comparison distribution Five

More information

By Keith Chrzan, Division Vice President, Marketing Sciences Group, Maritz Research

By Keith Chrzan, Division Vice President, Marketing Sciences Group, Maritz Research Monte Carlo Forecasting: Safer than it Sounds By Keith Chrzan, Division Vice President, Marketing Sciences Group, Maritz Research What do oil well exploration, the Dow Jones and hurricane wind probabilities

More information

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras. Lecture 11 t- Tests

Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras. Lecture 11 t- Tests Biostatistics and Design of Experiments Prof. Mukesh Doble Department of Biotechnology Indian Institute of Technology, Madras Lecture 11 t- Tests Welcome to the course on Biostatistics and Design of Experiments.

More information

χ test statistics of 2.5? χ we see that: χ indicate agreement between the two sets of frequencies.

χ test statistics of 2.5? χ we see that: χ indicate agreement between the two sets of frequencies. I. T or F. (1 points each) 1. The χ -distribution is symmetric. F. The χ may be negative, zero, or positive F 3. The chi-square distribution is skewed to the right. T 4. The observed frequency of a cell

More information

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability?

Probability: Why do we care? Lecture 2: Probability and Distributions. Classical Definition. What is Probability? Probability: Why do we care? Lecture 2: Probability and Distributions Sandy Eckel seckel@jhsph.edu 22 April 2008 Probability helps us by: Allowing us to translate scientific questions into mathematical

More information

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS

T.I.H.E. IT 233 Statistics and Probability: Sem. 1: 2013 ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS ESTIMATION AND HYPOTHESIS TESTING OF TWO POPULATIONS In our work on hypothesis testing, we used the value of a sample statistic to challenge an accepted value of a population parameter. We focused only

More information

Hypothesis T e T sting w ith with O ne O One-Way - ANOV ANO A V Statistics Arlo Clark Foos -

Hypothesis T e T sting w ith with O ne O One-Way - ANOV ANO A V Statistics Arlo Clark Foos - Hypothesis Testing with One-Way ANOVA Statistics Arlo Clark-Foos Conceptual Refresher 1. Standardized z distribution of scores and of means can be represented as percentile rankings. 2. t distribution

More information

Adaptive Designs: Why, How and When?

Adaptive Designs: Why, How and When? Adaptive Designs: Why, How and When? Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj ISBS Conference Shanghai, July 2008 1 Adaptive designs:

More information

Categorical Data Analysis 1

Categorical Data Analysis 1 Categorical Data Analysis 1 STA 312: Fall 2012 1 See last slide for copyright information. 1 / 1 Variables and Cases There are n cases (people, rats, factories, wolf packs) in a data set. A variable is

More information

Power and sample size calculations

Power and sample size calculations Power and sample size calculations Susanne Rosthøj Biostatistisk Afdeling Institut for Folkesundhedsvidenskab Københavns Universitet sr@biostat.ku.dk April 8, 2014 Planning an investigation How many individuals

More information

PubHlth 540 Estimation Page 1 of 69. Unit 6 Estimation

PubHlth 540 Estimation Page 1 of 69. Unit 6 Estimation PubHlth 540 Estimation Page 1 of 69 Unit 6 Estimation Topic 1. Introduction........ a. Goals of Estimation. b. Notation and Definitions. c. How to Interpret a Confidence Interval. Preliminaries: Some Useful

More information

The t-statistic. Student s t Test

The t-statistic. Student s t Test The t-statistic 1 Student s t Test When the population standard deviation is not known, you cannot use a z score hypothesis test Use Student s t test instead Student s t, or t test is, conceptually, very

More information

Section 6.2 Hypothesis Testing

Section 6.2 Hypothesis Testing Section 6.2 Hypothesis Testing GIVEN: an unknown parameter, and two mutually exclusive statements H 0 and H 1 about. The Statistician must decide either to accept H 0 or to accept H 1. This kind of problem

More information

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College

1-Way ANOVA MATH 143. Spring Department of Mathematics and Statistics Calvin College 1-Way ANOVA MATH 143 Department of Mathematics and Statistics Calvin College Spring 2010 The basic ANOVA situation Two variables: 1 Categorical, 1 Quantitative Main Question: Do the (means of) the quantitative

More information

2011 Pearson Education, Inc

2011 Pearson Education, Inc Statistics for Business and Economics Chapter 7 Inferences Based on Two Samples: Confidence Intervals & Tests of Hypotheses Content 1. Identifying the Target Parameter 2. Comparing Two Population Means:

More information

Contingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878

Contingency Tables. Safety equipment in use Fatal Non-fatal Total. None 1, , ,128 Seat belt , ,878 Contingency Tables I. Definition & Examples. A) Contingency tables are tables where we are looking at two (or more - but we won t cover three or more way tables, it s way too complicated) factors, each

More information

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests:

One sided tests. An example of a two sided alternative is what we ve been using for our two sample tests: One sided tests So far all of our tests have been two sided. While this may be a bit easier to understand, this is often not the best way to do a hypothesis test. One simple thing that we can do to get

More information

Lecture 7: Confidence interval and Normal approximation

Lecture 7: Confidence interval and Normal approximation Lecture 7: Confidence interval and Normal approximation 26th of November 2015 Confidence interval 26th of November 2015 1 / 23 Random sample and uncertainty Example: we aim at estimating the average height

More information

Chapter 9. Hypothesis testing. 9.1 Introduction

Chapter 9. Hypothesis testing. 9.1 Introduction Chapter 9 Hypothesis testing 9.1 Introduction Confidence intervals are one of the two most common types of statistical inference. Use them when our goal is to estimate a population parameter. The second

More information

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology

Group Sequential Tests for Delayed Responses. Christopher Jennison. Lisa Hampson. Workshop on Special Topics on Sequential Methodology Group Sequential Tests for Delayed Responses Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj Lisa Hampson Department of Mathematics and Statistics,

More information

Chapter Six: Two Independent Samples Methods 1/51

Chapter Six: Two Independent Samples Methods 1/51 Chapter Six: Two Independent Samples Methods 1/51 6.3 Methods Related To Differences Between Proportions 2/51 Test For A Difference Between Proportions:Introduction Suppose a sampling distribution were

More information

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI

Module 03 Lecture 14 Inferential Statistics ANOVA and TOI Introduction of Data Analytics Prof. Nandan Sudarsanam and Prof. B Ravindran Department of Management Studies and Department of Computer Science and Engineering Indian Institute of Technology, Madras Module

More information

CHAPTER 5 LINEAR REGRESSION AND CORRELATION

CHAPTER 5 LINEAR REGRESSION AND CORRELATION CHAPTER 5 LINEAR REGRESSION AND CORRELATION Expected Outcomes Able to use simple and multiple linear regression analysis, and correlation. Able to conduct hypothesis testing for simple and multiple linear

More information

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping

Optimising Group Sequential Designs. Decision Theory, Dynamic Programming. and Optimal Stopping : Decision Theory, Dynamic Programming and Optimal Stopping Christopher Jennison Department of Mathematical Sciences, University of Bath, UK http://people.bath.ac.uk/mascj InSPiRe Conference on Methodology

More information

Null Hypothesis Significance Testing p-values, significance level, power, t-tests Spring 2017

Null Hypothesis Significance Testing p-values, significance level, power, t-tests Spring 2017 Null Hypothesis Significance Testing p-values, significance level, power, t-tests 18.05 Spring 2017 Understand this figure f(x H 0 ) x reject H 0 don t reject H 0 reject H 0 x = test statistic f (x H 0

More information

Sleep data, two drugs Ch13.xls

Sleep data, two drugs Ch13.xls Model Based Statistics in Biology. Part IV. The General Linear Mixed Model.. Chapter 13.3 Fixed*Random Effects (Paired t-test) ReCap. Part I (Chapters 1,2,3,4), Part II (Ch 5, 6, 7) ReCap Part III (Ch

More information

Population 1 Population 2

Population 1 Population 2 Two Population Case Testing the Difference Between Two Population Means Sample of Size n _ Sample mean = x Sample s.d.=s x Sample of Size m _ Sample mean = y Sample s.d.=s y Pop n mean=μ x Pop n s.d.=

More information

AP Statistics Cumulative AP Exam Study Guide

AP Statistics Cumulative AP Exam Study Guide AP Statistics Cumulative AP Eam Study Guide Chapters & 3 - Graphs Statistics the science of collecting, analyzing, and drawing conclusions from data. Descriptive methods of organizing and summarizing statistics

More information

Power and the computation of sample size

Power and the computation of sample size 9 Power and the computation of sample size A statistical test will not be able to detect a true difference if the sample size is too small compared with the magnitude of the difference. When designing

More information

Hypothesis testing. Data to decisions

Hypothesis testing. Data to decisions Hypothesis testing Data to decisions The idea Null hypothesis: H 0 : the DGP/population has property P Under the null, a sample statistic has a known distribution If, under that that distribution, the

More information

Chapter 24. Comparing Means. Copyright 2010 Pearson Education, Inc.

Chapter 24. Comparing Means. Copyright 2010 Pearson Education, Inc. Chapter 24 Comparing Means Copyright 2010 Pearson Education, Inc. Plot the Data The natural display for comparing two groups is boxplots of the data for the two groups, placed side-by-side. For example:

More information

Salt Lake Community College MATH 1040 Final Exam Fall Semester 2011 Form E

Salt Lake Community College MATH 1040 Final Exam Fall Semester 2011 Form E Salt Lake Community College MATH 1040 Final Exam Fall Semester 011 Form E Name Instructor Time Limit: 10 minutes Any hand-held calculator may be used. Computers, cell phones, or other communication devices

More information

Lab #12: Exam 3 Review Key

Lab #12: Exam 3 Review Key Psychological Statistics Practice Lab#1 Dr. M. Plonsky Page 1 of 7 Lab #1: Exam 3 Review Key 1) a. Probability - Refers to the likelihood that an event will occur. Ranges from 0 to 1. b. Sampling Distribution

More information

Unit 27 One-Way Analysis of Variance

Unit 27 One-Way Analysis of Variance Unit 27 One-Way Analysis of Variance Objectives: To perform the hypothesis test in a one-way analysis of variance for comparing more than two population means Recall that a two sample t test is applied

More information

Sample Size Calculations for Group Randomized Trials with Unequal Sample Sizes through Monte Carlo Simulations

Sample Size Calculations for Group Randomized Trials with Unequal Sample Sizes through Monte Carlo Simulations Sample Size Calculations for Group Randomized Trials with Unequal Sample Sizes through Monte Carlo Simulations Ben Brewer Duke University March 10, 2017 Introduction Group randomized trials (GRTs) are

More information

Correlation and Simple Linear Regression

Correlation and Simple Linear Regression Correlation and Simple Linear Regression Sasivimol Rattanasiri, Ph.D Section for Clinical Epidemiology and Biostatistics Ramathibodi Hospital, Mahidol University E-mail: sasivimol.rat@mahidol.ac.th 1 Outline

More information

One-sample categorical data: approximate inference

One-sample categorical data: approximate inference One-sample categorical data: approximate inference Patrick Breheny October 6 Patrick Breheny Biostatistical Methods I (BIOS 5710) 1/25 Introduction It is relatively easy to think about the distribution

More information

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data

Person-Time Data. Incidence. Cumulative Incidence: Example. Cumulative Incidence. Person-Time Data. Person-Time Data Person-Time Data CF Jeff Lin, MD., PhD. Incidence 1. Cumulative incidence (incidence proportion) 2. Incidence density (incidence rate) December 14, 2005 c Jeff Lin, MD., PhD. c Jeff Lin, MD., PhD. Person-Time

More information

The Design of a Survival Study

The Design of a Survival Study The Design of a Survival Study The design of survival studies are usually based on the logrank test, and sometimes assumes the exponential distribution. As in standard designs, the power depends on The

More information