E509A: Principle of Biostatistics. GY Zou


E509A: Principle of Biostatistics (Week 4: Inference for a single mean) GY Zou gzou@srobarts.ca

Example 5.4 (p. 183). A random sample of n = 16 has mean I.Q. 106 with standard deviation S = 12.4. What is the 95% CI?
106 ± t_{1−0.05/2, 16−1} × 12.4/√16 = (99.4, 112.6)
Example 5.5 (p. 184). A random sample of n = 65 has a mean number of visits over a 3-year period of 16 with S = 1.4. What is the 99% CI?
16 ± z_{1−0.01/2} × 1.4/√65 = (15.5, 16.5)

Meaning of a confidence interval: a confidence interval constructed from a single sample either covers or does not cover the true parameter. "We are unable to predict the result of any single observation before we have made it, but we can predict, with very considerable accuracy, the result of a long series." (Weldon, 1906). It is incorrect to say "There is a 95% probability that the estimated interval [a, b] contains the unknown μ," because [a, b] changes from sample to sample, while μ is fixed. Imagine throwing a horseshoe in a dark room. The (1 − α)100% refers to repetition: if the study were repeated 100 times, then of the 100 resulting (1 − α)100% confidence intervals we would expect (1 − α)100 to include the population parameter. Your samples from the Framingham Study will show this.

Why not just construct a 100% confidence interval?

Sample size for estimating a population mean. Mindset: sample size estimation is used to distinguish n from 3n, not n from n + 3.

Since the CI is given by X̄ ± z_{1−α/2} σ/√n, the uncertainty is z_{1−α/2} σ/√n. Denote it as E, i.e.,
E = z_{1−α/2} σ/√n
Thus
n = (z_{1−α/2} σ / E)²
σ and E must be given by the researcher: from the literature, gut feeling, etc.; sometimes σ = range/4 is used. The probability of achieving the target precision is only 50%.

Example 5.7 (p. 188). A hospital administration wants to estimate the mean time it takes for patients to get from one department to another. The margin of error is 5 minutes at 95% confidence. How big a sample is needed? Do a pilot study if there is nothing to rely on. Here σ = 17 from a pilot, thus
n = (z_{1−α/2} σ / E)² = (1.96 × 17 / 5)² = 44.4, round up to 45
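The arithmetic above can be cross-checked with a short script (a sketch in Python rather than the SAS used in these notes; the function name is my own):

```python
import math

def n_for_precision(sigma, E, z=1.96):
    """Sample size so the CI half-width is E: n = (z*sigma/E)^2.
    As noted above, the resulting interval achieves half-width <= E
    with only about 50% probability."""
    return (z * sigma / E) ** 2

n = n_for_precision(sigma=17, E=5)  # Example 5.7
print(round(n, 1), math.ceil(n))    # 44.4, rounded up to 45
```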

options ls=64 nocenter;
proc power;
  onesamplemeans ci=t
    alpha = 0.05
    halfwidth = 5
    stddev = 17
    probwidth = 0.5
    ntotal = .;
run;
probwidth = desired probability of achieving the target precision

The POWER Procedure
Confidence Interval for Mean

Fixed Scenario Elements
Distribution           Normal
Method                 Exact
Alpha                  0.05
CI Half-Width          5
Standard Deviation     17
Nominal Prob(Width)    0.5
Number of Sides        2
Prob Type              Conditional

Computed N Total
Actual Prob(Width)   N Total
0.525                47

One afternoon at Rothamsted, Ronald A. Fisher poured a cup of tea and offered it to the woman standing beside him. She refused, remarking that she preferred the milk to be in the cup before the tea was added. Fisher could not believe there could be any difference in the taste, and so a trial was conducted. The woman correctly identified more than enough of the cups into which tea had been poured first to prove her case.

Assume 10 cups of tea are made without the woman knowing how they were made. The woman correctly identified 9 cups. Did she guess correctly, or is there indeed a difference in taste? If she were guessing, each cup would be a 50/50 chance, which gives H0: p = 0.5.
Pr(X = 9) = [10! / (9!(10 − 9)!)] (0.5)^9 (1 − 0.5)^{10−9} = 0.0098
Another piece: it would also count if the woman had identified all 10.
Pr(X = 10) = [10! / (10!(10 − 10)!)] (0.5)^{10} (1 − 0.5)^{10−10} = 0.0010
Thus, if the woman had been guessing, the probability of correctly identifying 9 or more out of 10 just by chance is 0.0098 + 0.0010 = 0.0108. This probability, of getting the observed result or one more extreme, is called the p-value.
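The binomial arithmetic is easy to verify (a Python sketch; math.comb gives the binomial coefficient):

```python
from math import comb

# Pr(X = 9) and Pr(X = 10) under H0: p = 0.5 with 10 cups
p_9 = comb(10, 9) * 0.5**9 * 0.5**1
p_10 = comb(10, 10) * 0.5**10
p_value = p_9 + p_10  # Pr(X >= 9 | H0)
print(round(p_9, 4), round(p_10, 4), round(p_value, 4))
```

The exact sum is 0.0107; the slide's 0.0108 comes from rounding each term before adding.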

p = Pr(observed | H0). It is NOT Pr(H0 | observed).

Hypothesis testing. Researchers always have some hypothesis: e.g., diabetics have raised BP, oral contraceptives may cause breast cancer, etc. Can we prove a hypothesis? No: one can always think of cases which have not yet arisen. Thus we set out to disprove a hypothesis; this is what we call hypothesis testing.

Three steps:
- Choose a significance level α for the test (also called the false positive error rate we are willing to accept);
- Pretend the null hypothesis is true (so we have a distribution as a benchmark), conduct the study, observe the data, and compute the p-value;
- Compare p and α and make a decision: reject H0 or do not reject H0.
α is selected before the study begins; p is calculated after the study.

What is the p-value? Suppose a study observed a test statistic of 2.05 and the p-value for testing H0: μ = 0 is 0.04. If we replicated the study 100 times and H0 were true, then in about 4 of these 100 studies we would observe a statistic of at least 2.05. In terms of conditional probability, p = Pr(Data | H0). We know that Pr(Data | H0) ≠ Pr(H0 | Data). Therefore, the p-value is NOT the probability of H0 being true.

To repeat the message: in research, most of the time we collect evidence against H0 (just like a prosecutor in a trial); we do NOT prove H0. When our p-value is larger than 5%, we say we do not have sufficient evidence to suggest H1, but NEVER say we showed no effect or we proved H0. Hartung et al. 1983. Absence of evidence is not evidence of absence. Anesthesiology 58: 298-300. Donald Rumsfeld knows this, so should you. See http://www.defenselink.mil/news/feb2002/t02122002_t212sdv2.html

Example 5.9 (p. 196). The population mean cholesterol for males age 50 is μ = 241. We wish to see if a modified diet could reduce it. n = 12 people go on the diet for 3 months. Set α = 0.05 and H0: μ = 241 versus H1: μ < 241. X̄ = 235 and S = 12.5. Assuming cholesterol is normally distributed,
T = (X̄ − μ0)/(S/√n) ~ t_{12−1}
T = (235 − 241)/(12.5/√12) = −1.66 > −t_{0.05, 12−1} = −1.796
Do not reject H0. p = 0.063.
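A quick numeric check of the test statistic (Python sketch; the critical value −1.796 is taken from the slide rather than recomputed, since the standard library has no t quantile function):

```python
import math

n, xbar, s, mu0 = 12, 235.0, 12.5, 241.0
T = (xbar - mu0) / (s / math.sqrt(n))  # one-sample t statistic
t_crit = -1.796                        # lower-tail critical value, df = 11 (from the slide)
print(round(T, 2), T > t_crit)         # T lies above the cutoff: do not reject H0
```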

Example 5.10 (p. 198). The male entry-level salary is μ0 = $29,500. We wish to see if the female entry salary is significantly different from this. Take a sample of size 10; the observations (in $1000) are: 32 27 31 27 26 26 30 22 25 36. Set α = 0.05. Assuming a normal distribution,
T = (X̄ − μ0)/(S/√n) ~ t_{10−1}
T = (28.2 − 29.5)/(4.05/√10) = −1.02
If H0: μ = 29.5 (×$1000) is true, p = 0.3366.

SAS program:
options nocenter ls=80 ps=100;
data salary;
  input salary @@;
  cards;
32 27 31 27 26 26 30 22 25 36
;
proc print;
proc ttest H0=29.5;
run;

The SAS System 22:45 Saturday, September 24, 2005

The TTEST Procedure

Statistics
                Lower CL            Upper CL   Lower CL
Variable   N    Mean       Mean     Mean       Std Dev    Std Dev
salary     10   25.303     28.2     31.097     2.7855     4.0497

           Upper CL
Variable   Std Dev    Std Err   Minimum   Maximum
salary     7.3932     1.2806    22        36

T-Tests
Variable   DF   t Value   Pr > |t|
salary     9    -1.02     0.336
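The SAS output can be reproduced from the raw data (Python sketch; the t quantile 2.2622 = t_{0.975, 9} is hard-coded from tables since the standard library has no t quantile function):

```python
import math
import statistics

salary = [32, 27, 31, 27, 26, 26, 30, 22, 25, 36]
n = len(salary)
xbar = statistics.mean(salary)       # 28.2   (matches Mean)
s = statistics.stdev(salary)         # 4.0497 (matches Std Dev)
se = s / math.sqrt(n)                # 1.2806 (matches Std Err)
t = (xbar - 29.5) / se               # -1.02  (matches t Value)
t975 = 2.2622                        # t quantile, df = 9 (assumed from tables)
ci = (xbar - t975 * se, xbar + t975 * se)  # 95% CI (25.303, 31.097)
print(round(t, 2), tuple(round(v, 3) for v in ci))
```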

Sample size estimation. For a two-sided test,
n = ( (z_{1−α/2} + z_{1−β}) / ((μ1 − μ0)/σ) )²
where 1 − β is the power of the test, i.e., the probability of detecting a difference if such a difference does exist. For a one-sided test,
n = ( (z_{1−α} + z_{1−β}) / ((μ1 − μ0)/σ) )²

Type I and Type II errors
                     Truth
Decision        H0 is true          H0 is not true
Reject          Type I error (α)    Power (1 − β)
Don't reject                        Type II error (β)
Is there a Type III error?
                     Status
Test result     No disease (D−)              Disease (D+)
T+                                           Pr(T+ | D+) = Sensitivity
T−              Pr(T− | D−) = Specificity

Earlier we obtained Pr(D+ | T+) (or Pr(D− | T−)) using knowledge of Pr(D), Pr(T+ | D+), and Pr(T− | D−). Question: can we do the same here, i.e., can we obtain the analogue of Pr(D | T)? In other words, can we use data to obtain the probability of H0 being true? Many people have tried that, but ...

Example 5.14 (p. 210). Suppose we wish to conduct a study to test μ = 100 at a 5% level of significance with 80% power. A difference of 5 units would be worthwhile; σ = 9.5.
n = ( (z_{1−α/2} + z_{1−β}) / ((μ1 − μ0)/σ) )² = ( (1.96 + 0.84) / (5/9.5) )² = 28.33, round up to 29
proc power;
  onesamplemeans
    nullm = 100
    mean = 105
    ntotal = .
    stddev = 9.5
    power = .80;
run;

The POWER Procedure
One-sample t Test for Mean

Fixed Scenario Elements
Distribution          Normal
Method                Exact
Null Mean             100
Mean                  105
Standard Deviation    9.5
Nominal Power         0.8
Number of Sides       2
Alpha                 0.05

Computed N Total
Actual Power   N Total
0.809          31
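The normal-approximation formula can be checked in Python (statistics.NormalDist supplies the z quantiles). Note that SAS reports 31 because PROC POWER uses the exact t-based method, while the z formula gives 29:

```python
import math
from statistics import NormalDist

alpha, power = 0.05, 0.80
mu0, mu1, sigma = 100, 105, 9.5
z = NormalDist().inv_cdf  # standard normal quantile function
n = ((z(1 - alpha / 2) + z(power)) / ((mu1 - mu0) / sigma)) ** 2
print(round(n, 2), math.ceil(n))  # 28.33, rounded up to 29
```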

Power calculation. For a two-sided test,
1 − β = Pr( Z > z_{1−α/2} − (μ1 − μ0)/(σ/√n) )
For a one-sided test,
1 − β = Pr( Z > z_{1−α} − (μ1 − μ0)/(σ/√n) )

Example 5.11 (p. 208). μ0 = 80 and μ1 = 85; α = 5%, two-sided test, n = 20, σ = 9.5.
z_{1−α/2} − (μ1 − μ0)/(σ/√n) = 1.96 − 5/(9.5/√20) = −0.40
Pr(Z > −0.40) = 1 − Pr(Z < −0.40) = 1 − 0.3446 = 0.6554

proc power;
  onesamplemeans
    nullm = 80
    mean = 85
    ntotal = 20
    stddev = 9.5
    power = .;
run;

The POWER Procedure
One-sample t Test for Mean

Fixed Scenario Elements
Distribution          Normal
Method                Exact
Null Mean             80
Mean                  85
Standard Deviation    9.5
Total Sample Size     20
Number of Sides       2
Alpha                 0.05

Computed Power
Power   0.608
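The hand calculation of Example 5.11 checks out numerically (Python sketch; this is again the z-approximation, which is why it gives about 0.65 while SAS's exact t-based method gives 0.608; small differences from the slide's 0.6554 come from rounding the cutoff to −0.40 by hand):

```python
import math
from statistics import NormalDist

mu0, mu1, sigma, n, alpha = 80, 85, 9.5, 20, 0.05
Z = NormalDist()
# cutoff = z_{1-alpha/2} - (mu1 - mu0)/(sigma/sqrt(n))
cutoff = Z.inv_cdf(1 - alpha / 2) - (mu1 - mu0) / (sigma / math.sqrt(n))
power = 1 - Z.cdf(cutoff)  # Pr(Z > cutoff)
print(round(cutoff, 2), round(power, 2))
```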

The relationship between confidence intervals and hypothesis testing: if a (1 − α)100% confidence interval contains the null hypothesis value, then the two-sided test does not reject the null hypothesis at the α level. This means one can read off hypothesis testing results by looking at a confidence interval. Suppose your confidence interval for μ is (−0.2, 0.5) and you want to test H0: μ = 0: do not reject H0. It is also clear what the conclusion of testing H0: μ = 0.51 is. In fact, you can do infinitely many tests with one confidence interval.
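The duality can be demonstrated with the salary data of Example 5.10, whose 95% CI from SAS was (25.303, 31.097): any null value inside that interval is not rejected at the 5% level (Python sketch; the helper function is my own):

```python
def two_sided_test_via_ci(mu0, ci):
    """Reject H0: mu = mu0 at level alpha iff mu0 falls outside
    the (1 - alpha)100% confidence interval."""
    lo, hi = ci
    return not (lo <= mu0 <= hi)

ci = (25.303, 31.097)                   # 95% CI from Example 5.10
print(two_sided_test_via_ci(29.5, ci))  # False: do not reject (p = 0.336)
print(two_sided_test_via_ci(24.0, ci))  # True: reject at the 5% level
```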

To look up a t critical value (quantile) with known degrees of freedom and probability, use crit = tinv(prob, df); to look up a probability with known degrees of freedom and a calculated test statistic, use prob = probt(tcal, df);
data;
  prob = 0.95; df = 12;
  crit = tinv(prob, df);
  tcal = 1.812; df1 = 10;
  prob1 = probt(tcal, df1);
;
proc print;
run;

Obs   prob   df   crit      tcal    df1   prob1
1     0.95   12   1.78229   1.812   10    0.94996

SAS program for Ex 5.7:
proc power;
  onesamplemeans ci=t
    alpha = 0.05
    halfwidth = 1 2 3 4 5 10
    stddev = 17
    probwidth = .50
    ntotal = .;
run;

The POWER Procedure
Confidence Interval for Mean

Fixed Scenario Elements
Distribution           Normal
Method                 Exact
Alpha                  0.05
Standard Deviation     17
Nominal Prob(Width)    0.5
Number of Sides        2
Prob Type              Conditional

Computed N Total
Index   Half-Width   Actual Prob(Width)   N Total
1       1            0.507                1113
2       2            0.508                280
3       3            0.516                126
4       4            0.521                72
5       5            0.525                47
6       10           0.574                14
considering many scenarios.

SAS program for Ex 5.12 (p. 210):
proc power;
  onesamplemeans
    nullmean = 100
    mean = 105 103
    sides = 1 2
    alpha = 0.05
    stddev = 9.5
    power = .80 .90
    ntotal = .;
run;

The POWER Procedure
One-sample t Test for Mean

Fixed Scenario Elements
Distribution          Normal
Method                Exact
Null Mean             100
Alpha                 0.05
Standard Deviation    9.5

Computed N Total
Index   Sides   Mean   Nominal Power   Actual Power   N Total
1       1       105    0.8             0.804          24
2       1       105    0.9             0.906          33
3       1       103    0.8             0.803          64
4       1       103    0.9             0.902          88
5       2       105    0.8             0.809          31
6       2       105    0.9             0.901          40
7       2       103    0.8             0.802          81
8       2       103    0.9             0.902          108

The SAS documentation we have for PROC POWER may contain many errors; a corrected version can be obtained from http://ftp.sas.com/techsup/download/stat/power.pdf

The standardized quantity is our test statistic:
T = (X̄ − μ0)/(S/√n) = √n (X̄ − μ0)/S
If two-sided, look at Pr > |T|. The larger the n, the smaller the p; with a large enough sample it becomes almost impossible not to reject H0.

Confidence intervals and significance testing: in theory, they are closely related. The confidence interval approach uses the sample statistic to find out what parameter values make the observed statistic most plausible; significance testing fixes a parameter value and asks what sample statistics are consistent with that fixed value.

Recall: the lower limit (L) is the lowest parameter value that would make the observed x̄ the 97.5% quantile cutoff point, i.e., a right-tail test:
(x̄ − L)/(S/√n) = z_{0.975}, so L = x̄ − 1.96 S/√n
The upper limit (U) is the highest parameter value that would make the observed x̄ the 2.5% quantile cutoff point, i.e., a left-tail test:
(x̄ − U)/(S/√n) = z_{0.025}, so U = x̄ + 1.96 S/√n
The values of the parameter inside the 95% confidence interval are precisely those which would not be contradicted by a two-sided test at the 5% level.

It is a coincidence that L and U are symmetric about x̄ in this simplest case. In general, L and U are asymmetric about the sample estimate, just as our faces are usually asymmetric about our noses. x̄ − L and U − x̄ are called margins of error.

Validity of a statistical procedure: in practice, one data set cannot tell you whether a procedure is valid. One can use theory, a simulation study, or both to justify it.

If we want to know whether a sample size of 10 makes a confidence interval procedure for an exponential mean valid, we could: draw 10,000 samples from an exponential distribution, each with 10 observations; use the procedure to construct a 95% CI from each sample, giving 10,000 CIs; and count how many of these 10,000 CIs cover the true mean. If the count is close to 9,500, the procedure is valid; otherwise it is not. Similarly, if we want to know whether a hypothesis testing procedure is valid when the sample size is 10: draw 10,000 samples from a normal distribution, each with 10 observations; use the procedure to test the hypothesis with each sample at the 5% level, giving 10,000 conclusions (reject or not reject); and count the rejections. If the rejection rate is close to 5%, the procedure is valid; otherwise it is not.
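The second recipe, checking a test's Type I error rate by simulation, can be sketched in Python; the t critical value 2.262 for df = 9 is hard-coded from tables, and the seed is fixed so the run is reproducible:

```python
import math
import random

random.seed(1)
N_SIM, n, mu, sigma = 10_000, 10, 0.0, 1.0
t_crit = 2.262  # two-sided 5% critical value, df = 9 (assumed from tables)
rejections = 0
for _ in range(N_SIM):
    x = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(x) / n
    s2 = sum((v - xbar) ** 2 for v in x) / (n - 1)  # sample variance
    t = (xbar - mu) / math.sqrt(s2 / n)             # test the true mean
    if abs(t) > t_crit:
        rejections += 1
rate = rejections / N_SIM
print(rate)  # should be close to 0.05 if the t-test is valid at n = 10
```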