BIO5312 Biostatistics Lecture 6: Statistical hypothesis testings


1 BIO5312 Biostatistics Lecture 6: Statistical hypothesis testings Yujin Chung October 4th, 2016 Fall 2016 Yujin Chung Lec6: Statistical hypothesis testings Fall /30

2 Previous
Two types of statistical inference:
Estimation: concerned with estimating the values of specific population parameters. These specific values are referred to as point estimates. Sometimes interval estimation is carried out to specify an interval that likely includes the parameter value.
Hypothesis testing: concerned with testing whether the value of a population parameter is equal to some specific value.

3 Hypothesis testing
Philosophy: prove a claim by contradiction.
Analogy: dependent love story
Claim: You don't love me.
Reasoning: If you loved me, you would take the trash out every week and put your socks away.
Data: Some weeks you don't take the trash out, or you leave your socks where they fall.
Conclusion: You don't love me.

4 Statistical hypothesis testing
The hypothesis-testing framework specifies two hypotheses: the null hypothesis and the alternative hypothesis.
The null hypothesis ($H_0$) is often an initial claim that researchers specify using previous research or knowledge. Typically it is a statement that the value of a population parameter (such as a proportion, mean, or standard deviation) is equal to some claimed value.
The alternative hypothesis ($H_1$) is what you might believe to be true or hope to prove true.
$H_0$: you love me vs. $H_1$: you don't love me
Hypothesis testing provides an objective framework for making decisions using probabilistic methods, rather than relying on subjective impressions.

5 Examples
The average cholesterol level in children is 175 mg/dL. A group of men who have died from heart disease within the past year are identified, and the cholesterol levels of their offspring are measured.
(1) Is the average cholesterol level of these children larger than 175 mg/dL?
(2) Is the average cholesterol level different from that of children whose fathers do not have a history of heart disease?
$\mu_1$: the population mean cholesterol level in the case group
$\mu_2$: the population mean cholesterol level in the control group
(1) $H_0: \mu_1 = 175$ vs. $H_1: \mu_1 > 175$
(2) $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$
Are the IQ and the number of finger-wrist taps (fwt) of children in the lead-exposed group different from those of children in the control group?
$H_0$: the population means of fwt in the two groups are the same vs. $H_1$: the means are different

6 Four possible outcomes in hypothesis testing
             | Do not reject $H_0$                      | Reject $H_0$
$H_0$ true   | true negative ($1 - \alpha$)             | false positive: Type I error ($\alpha$)
$H_1$ true   | false negative: Type II error ($\beta$)  | true positive: Power ($1 - \beta$)
Two possible errors:
Type I error ($\alpha$): $\Pr(\text{reject } H_0 \mid H_0 \text{ true})$, commonly referred to as the significance level of a test.
Type II error ($\beta$): $\Pr(\text{do not reject } H_0 \mid H_1 \text{ true})$
The power of a test: $1 - \beta = \Pr(\text{reject } H_0 \mid H_1 \text{ true})$
We prefer a test with small $\alpha$ and large power ($1 - \beta$).
Statistical hypothesis test: choose the test with the greatest power ($1 - \beta$) among all possible tests with a given type I error $\alpha$.
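To make the roles of $\alpha$, $\beta$, and power concrete, here is a minimal sketch for a one-sided z-test. It assumes normal data with a known standard deviation and a specific alternative mean; the values 200, 50, and 10 are used only for illustration, and the function name `one_sided_z_power` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_z_power(mu0, mu1, sigma, n, alpha=0.05):
    """Pr(reject H0) for H0: mu = mu0 vs H1: mu > mu0 when the true mean is mu1.

    Assumes normal data with known sigma (an illustrative assumption).
    """
    z = NormalDist()                       # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)          # reject H0 when Z > z_crit
    shift = (mu1 - mu0) / (sigma / sqrt(n))
    # If the true mean is mu1, Z ~ N(shift, 1), so
    # Pr(Z > z_crit) = 1 - Phi(z_crit - shift).
    return 1 - z.cdf(z_crit - shift)

type_i_error = one_sided_z_power(175, 175, 50, 10)  # true mean equals mu0
power = one_sided_z_power(175, 200, 50, 10)         # hypothetical alternative
beta = 1 - power                                    # type II error
print(round(type_i_error, 3), round(power, 3), round(beta, 3))
```

When the true mean equals $\mu_0$, the rejection probability is just $\alpha$; under the alternative it is the power, and $\beta$ is its complement.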

7 t-test for the mean
We assume the cholesterol levels in children follow $N(\mu, \sigma^2)$. We wish to test whether the mean cholesterol level of children with a family history of heart disease is equal to 175 mg/dL, the average cholesterol level of children without such a history, or is larger than 175.
Hypotheses: $H_0: \mu = 175$ vs. $H_1: \mu > 175$.
Logic:
1 Assume $H_0$ is true.
2 If $\bar{x}$ is too large, it contradicts the assumption that $H_0$ is true.

8 t-test for the mean: critical-value method
$H_0: \mu = 175$ vs. $H_1: \mu > 175$.
The distribution of $\bar{X}$ under $H_0$: since $X_1, \ldots, X_n \sim N(175, \sigma^2)$,
$t = \frac{\bar{X} - 175}{S/\sqrt{n}} \sim t_{n-1}$.
[Figure: density function of $t_{n-1}$ for $H_0: \mu = 175$ vs. $H_1: \mu > 175$, with the acceptance region below the critical value $t_{n-1,1-\alpha}$ and the rejection region above it]
Critical value: $t_{n-1,1-\alpha}$
If $t > t_{n-1,1-\alpha}$, reject $H_0$; if $t \le t_{n-1,1-\alpha}$, do not reject $H_0$.
Type I error: $\Pr(t > t_{n-1,1-\alpha} \mid H_0) = \alpha$.

9 t-test for the mean: p-value method
$H_0: \mu = 175$ vs. $H_1: \mu > 175$.
Test statistic: $t = \frac{\bar{X} - 175}{S/\sqrt{n}}$. Under $H_0$, the test statistic $t \sim t_{n-1}$.
p-value: $\Pr(t > t(\text{obs}) \mid H_0)$, the probability of obtaining a test statistic as extreme as or more extreme than the actual test statistic value, given that $H_0$ is true.
[Figure: density function of $t_{n-1}$ for $H_0: \mu = 175$ vs. $H_1: \mu > 175$, showing the rejection region above $t_{n-1,1-\alpha}$ and the p-value $\Pr(t > t(\text{obs}))$ as the area to the right of $t(\text{obs})$]
If p-value $< \alpha$, reject $H_0$; if p-value $\ge \alpha$, do not reject $H_0$.

10 Significance level: $\alpha$
$H_0$ is rejected if $t > t_{n-1,1-\alpha}$ or p-value $< \alpha$.
$\alpha$: significance level (type I error), typically set to 0.05
Guidelines for judging the significance of a p-value:
If $p \ge 0.05$, then the results are considered not statistically significant
If $0.01 \le p < 0.05$, then the results are statistically significant
If $p < 0.01$, then the results are highly significant
If $p < 0.001$, then the results are very highly significant
Report an exact p-value!
The p-value indicates exactly how significant the results are without performing repeated significance tests at different $\alpha$ levels.
The p-value indicates how close to statistical significance the results have come even when they are not statistically significant.

11 Example: the cholesterol of children
Suppose the mean cholesterol level of 10 children whose fathers died from heart disease is 200 mg/dL and the sample standard deviation is 50 mg/dL. The average cholesterol level in children is known to be 175 mg/dL. Is the average cholesterol level of these children larger than 175 mg/dL?
Let $\mu$ be the population mean cholesterol level of children whose fathers died from heart disease.
The hypotheses are $H_0: \mu = 175$ vs. $H_1: \mu > 175$.
The test statistic is $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ and follows $t_{n-1}$ under $H_0$.
The observed test statistic is $t(\text{obs}) = \frac{200 - 175}{50/\sqrt{10}} = 1.58$.
Critical-value method: at the significance level 5%, the critical value is $t_{n-1,1-\alpha} = t_{9,0.95} = 1.833$ and the rejection region is $t > 1.833$.
Since $t(\text{obs}) = 1.58 < 1.833$, we cannot reject $H_0$ at significance level 5%.
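The arithmetic of the critical-value method can be checked with a short sketch. The summary statistics and the critical value 1.833 are the ones quoted in the example; the function name `one_sample_t` is a hypothetical helper.

```python
from math import sqrt

def one_sample_t(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))

t_obs = one_sample_t(xbar=200, mu0=175, s=50, n=10)
t_crit = 1.833             # t_{9, 0.95}, as quoted in the example
reject = t_obs > t_crit    # one-sided rejection region: t > t_crit
print(f"t(obs) = {t_obs:.2f}, reject H0: {reject}")
```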

12 Example: the cholesterol of children
Suppose the mean cholesterol level of 10 children whose fathers died from heart disease is 200 mg/dL and the sample standard deviation is 50 mg/dL. The average cholesterol level in children is known to be 175 mg/dL. Is the average cholesterol level of these children larger than 175 mg/dL?
Let $\mu$ be the population mean cholesterol level of children whose fathers died from heart disease.
The hypotheses are $H_0: \mu = 175$ vs. $H_1: \mu > 175$.
The test statistic is $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ and follows $t_{n-1} = t_9$ under $H_0$.
The observed test statistic is $t(\text{obs}) = \frac{200 - 175}{50/\sqrt{10}} = 1.58$.
p-value method: the p-value is $p = \Pr(t > t(\text{obs}) \mid H_0) = \Pr(t > 1.58 \mid H_0)$.
Since $p > 0.05$, we cannot reject $H_0$ at significance level 5%.
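As a rough check of the p-value method, the sketch below evaluates the upper-tail probability of the $t_9$ distribution by numerically integrating its density with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` are hypothetical, and the integration is only an approximation; in practice a statistics library (for example `scipy.stats.t.sf`) would normally be used instead.

```python
from math import exp, lgamma, log, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_c - (df + 1) / 2 * log(1 + x * x / df))

def t_upper_tail(t_obs, df, steps=2000):
    """Approximate Pr(T > t_obs) with Simpson's rule on [0, t_obs].

    steps must be even; a negative t_obs gives a negative step and still works.
    """
    h = t_obs / steps
    total = t_pdf(0.0, df) + t_pdf(t_obs, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t      # by symmetry, Pr(T > 0) = 0.5

t_obs = (200 - 175) / (50 / sqrt(10))
p_value = t_upper_tail(t_obs, df=9)
print(f"p = {p_value:.3f}")
```

The computed value is about 0.07, which is above 0.05 and so agrees with the conclusion of the example.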

13 One-tailed t-test for the mean
A one-tailed test is a test in which the values of the parameter being studied under the alternative hypothesis are allowed to be either greater than or less than the value of the parameter under the null hypothesis ($\mu_0$), but not both.
$H_0: \mu = \mu_0$; test statistic: $t = \frac{\bar{X} - \mu_0}{S/\sqrt{n}}$
$H_1$            | rejection region            | p-value
$\mu > \mu_0$    | $t > t_{n-1,1-\alpha}$      | $\Pr(t > t(\text{obs}) \mid H_0)$
$\mu < \mu_0$    | $t < t_{n-1,\alpha}$        | $\Pr(t < t(\text{obs}) \mid H_0)$

14 Two-sided alternatives
The test for $H_0: \mu = \mu_0$ vs. $H_1: \mu \neq \mu_0$ is based on $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$.
Critical-value method:
Rejection region: if $|t| > t_{n-1,1-\alpha/2}$, then $H_0$ is rejected.
Acceptance region: if $|t| \le t_{n-1,1-\alpha/2}$, then $H_0$ is NOT rejected.
Type I error: $\Pr(|t| > t_{n-1,1-\alpha/2} \mid H_0) = \alpha$.
p-value method: the p-value is $\Pr(|t| > |t(\text{obs})| \mid H_0)$. If $p < 0.05$, then $H_0$ is rejected at significance level 5%.

15 Example: two-sided alternatives (continued from the cholesterol data)
Is the average cholesterol level of children whose fathers had heart disease different from the US average cholesterol level of children (175)?
The hypotheses are $H_0: \mu = 175$ vs. $H_1: \mu \neq 175$.
The observed test statistic is $t(\text{obs}) = 1.58$.
Critical-value method: at significance level 5%, the rejection region is $t > 2.262$ or $t < -2.262$. Since the observed test statistic satisfies $-2.262 < 1.58 < 2.262$, we cannot reject $H_0$ at significance level 5%.
p-value method: $p = \Pr(|t| > |t(\text{obs})| \mid H_0) = \Pr(t > 1.58 \mid H_0) + \Pr(t < -1.58 \mid H_0)$. Since $p > 0.05$, we cannot reject $H_0$ at significance level 5%.

16 The Relationship Between Hypothesis Testing and Confidence Intervals
Suppose we are testing $H_0: \mu = \mu_0$ vs. $H_1: \mu \neq \mu_0$.
$H_0$ is rejected at significance level $\alpha$ if and only if the $100\%(1-\alpha)$ CI for $\mu$ does not contain $\mu_0$.
Recall that the $100\%(1-\alpha)$ CI for $\mu$ is $\bar{x} \pm t_{n-1,1-\alpha/2}\, s/\sqrt{n}$.
If $H_0$ is rejected, then
$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} < -t_{n-1,1-\alpha/2}$ or $t > t_{n-1,1-\alpha/2}$
$\Leftrightarrow \bar{x} - \mu_0 < -t_{n-1,1-\alpha/2}\, s/\sqrt{n}$ or $\bar{x} - \mu_0 > t_{n-1,1-\alpha/2}\, s/\sqrt{n}$
$\Leftrightarrow \mu_0 > \bar{x} + t_{n-1,1-\alpha/2}\, s/\sqrt{n}$ or $\mu_0 < \bar{x} - t_{n-1,1-\alpha/2}\, s/\sqrt{n}$.
Therefore, the $100\%(1-\alpha)$ CI does not include $\mu_0$. Similarly, the converse can be proved.
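The equivalence can be illustrated on the cholesterol example. The sketch below makes the two-sided decision twice, once from the test statistic and once from the 95% CI, using the summary statistics and the critical value 2.262 quoted in the lecture examples; the variable names are illustrative only.

```python
from math import sqrt

xbar, s, n, mu0 = 200, 50, 10, 175
t_crit = 2.262                       # t_{9, 0.975}, as quoted in the lecture example

half_width = t_crit * s / sqrt(n)
ci = (xbar - half_width, xbar + half_width)    # 95% CI for mu
t_obs = (xbar - mu0) / (s / sqrt(n))

reject_by_test = abs(t_obs) > t_crit           # two-sided rejection region
reject_by_ci = not (ci[0] <= mu0 <= ci[1])     # reject iff mu0 lies outside the CI
print(f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f})")
print(reject_by_test, reject_by_ci)
```

Both decisions agree: the interval contains 175, and the test does not reject $H_0$.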

17 z-test for the Mean with Known Variance
Let $X_1, \ldots, X_n \sim N(\mu, \sigma^2)$ with $\sigma^2$ known. The test for $H_0: \mu = \mu_0$ is based on the test statistic
$Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$,
which follows $N(0, 1)$ under $H_0$.
$H_1$              | Rejection region             | p-value
$\mu > \mu_0$      | $z > z_{1-\alpha}$           | $\Pr(Z > z(\text{obs}) \mid H_0)$
$\mu < \mu_0$      | $z < z_{\alpha}$             | $\Pr(Z < z(\text{obs}) \mid H_0)$
$\mu \neq \mu_0$   | $|z| > z_{1-\alpha/2}$       | $\Pr(|Z| > |z(\text{obs})| \mid H_0)$
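The three rows of the table can be sketched as one small function. This is a minimal illustration using the standard library's `statistics.NormalDist`; the function name `z_test`, the `alternative` labels, and the choice of $\sigma = 50$ as a known value are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """z statistic and p-value for H0: mu = mu0 with known sigma."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    phi = NormalDist().cdf               # standard normal CDF
    if alternative == "greater":         # H1: mu > mu0
        p = 1 - phi(z)
    elif alternative == "less":          # H1: mu < mu0
        p = phi(z)
    else:                                # H1: mu != mu0
        p = 2 * (1 - phi(abs(z)))
    return z, p

# Hypothetical use: treat sigma = 50 as known for the cholesterol summary data.
z_obs, p_greater = z_test(200, 175, 50, 10, alternative="greater")
_, p_two_sided = z_test(200, 175, 50, 10)
print(f"z = {z_obs:.2f}, one-sided p = {p_greater:.3f}, two-sided p = {p_two_sided:.3f}")
```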

18 Tests for Binomial probability p
Let $X \sim \text{Binomial}(n, p)$. We want to test $H_0: p = p_0$ vs. $H_1: p \neq p_0$.
The test statistic is $Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$, where $\hat{p} = X/n$.
If $n p_0 (1 - p_0) \ge 5$, then approximately $Z \sim N(0, 1)$ under $H_0$.
Rejection region: $z(\text{obs}) < z_{\alpha/2}$ or $z(\text{obs}) > z_{1-\alpha/2}$
p-value:
If $\hat{p} \le p_0$, then $p = 2 \Pr(Z < z(\text{obs}) \mid H_0)$
If $\hat{p} > p_0$, then $p = 2 \Pr(Z > z(\text{obs}) \mid H_0)$
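A minimal sketch of this normal-approximation test follows. The data (60 successes in 100 trials, $p_0 = 0.5$) are hypothetical, and the function name `proportion_z_test` is an illustrative choice; the check on $n p_0 (1 - p_0)$ mirrors the condition stated above.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0):
    """Two-sided normal-approximation test of H0: p = p0."""
    if n * p0 * (1 - p0) < 5:
        raise ValueError("normal approximation needs n * p0 * (1 - p0) >= 5")
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    phi = NormalDist().cdf
    # Double the tail on the side where p_hat falls relative to p0.
    p = 2 * phi(z) if p_hat <= p0 else 2 * (1 - phi(z))
    return z, p

z_obs, p_value = proportion_z_test(x=60, n=100, p0=0.5)   # hypothetical data
print(f"z = {z_obs:.2f}, p = {p_value:.4f}")
```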

19 Two-sample case
In a two-sample hypothesis-testing problem, the underlying parameters of two different populations are compared.
Examples: left-hand and right-hand fwt; maxfwt in the control group and the exposed group.
Independent samples: the data points in one sample are unrelated to the data points in the second sample.

20 The paired sample: paired t-test
Paired sample: each data point in the first sample is matched and is related to a unique data point in the second sample.
Paired samples may represent two sets of measurements on the same people, or on different people who are chosen on an individual basis using matching criteria, such as age and sex, to be very similar to each other.
LEAD data: the numbers of right-hand and left-hand finger-wrist taps (fwt_r and fwt_l, respectively) were observed for each of 124 children. We want to test whether the number of finger-wrist taps differs between the right hand and the left hand.

21 The paired sample: paired t-test
Consider two samples: $(X_{1,1}, X_{2,1}), \ldots, (X_{1,n}, X_{2,n})$, where $E(X_{1,i}) = \mu_1$ and $E(X_{2,i}) = \mu_2$ for all $i = 1, \ldots, n$.
We want to test $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$.
Let $\Delta = \mu_1 - \mu_2$. Then $H_0: \Delta = 0$ vs. $H_1: \Delta \neq 0$.
To get rid of the correlation between $X_{1,i}$ and $X_{2,i}$, we consider the differences $d_i = X_{1,i} - X_{2,i}$ for $i = 1, \ldots, n$.
We assume $d_1, \ldots, d_n \sim N(\Delta, \sigma_d^2)$. It is a one-sample t-test problem.
Test statistic: $t = \frac{\bar{d}}{s_d/\sqrt{n}}$, where $\bar{d}$ and $s_d$ are the sample mean and standard deviation of the differences, respectively.
Under $H_0$: $t \sim t_{n-1}$
p-value $= 2 \Pr(t > |t(\text{obs})| \mid H_0)$
CI: $\bar{d} \pm t_{n-1,1-\alpha/2}\, s_d/\sqrt{n}$
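The reduction to a one-sample problem can be sketched as follows. The five pairs of measurements are made-up numbers for illustration only (they are not the LEAD data), and the variable names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical paired measurements on the same five subjects.
right = [10, 12, 9, 11, 13]
left = [8, 11, 9, 9, 10]

d = [r - lt for r, lt in zip(right, left)]   # differences d_i = X_{1,i} - X_{2,i}
d_bar = mean(d)                              # sample mean of the differences
s_d = stdev(d)                               # sample standard deviation (n - 1 divisor)
n = len(d)

t_obs = d_bar / (s_d / sqrt(n))              # compare with t_{n-1} under H0
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.3f}, t = {t_obs:.3f}, df = {n - 1}")
```

The p-value and CI then follow exactly as in the one-sample t-test, applied to the differences.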

22 Example: Lead data
The numbers of right-hand and left-hand finger-wrist taps (fwt_r and fwt_l, respectively) were observed for each of 124 children. We want to test whether the number of finger-wrist taps differs between the right hand and the left hand.
Since fwt_r and fwt_l form a paired sample and are not independent, we consider their differences.
$H_0: \Delta = 0$ vs. $H_1: \Delta \neq 0$
Compute the mean $\bar{d}$ and the standard deviation $s_d$ of the differences.
Test statistic: $t = \frac{\bar{d}}{s_d/\sqrt{124}}$
The distribution of the test statistic under $H_0$: $t_{123}$
p-value: $2 \Pr(t > |t(\text{obs})|)$
At significance level 5%, we reject the null hypothesis. Right-hand and left-hand fwt are significantly different.

23 Two independent samples: equal variances
Consider two independent samples: $X_{1,1}, \ldots, X_{1,n_1} \sim N(\mu_1, \sigma_1^2)$ (sample size $n_1$) and $X_{2,1}, \ldots, X_{2,n_2} \sim N(\mu_2, \sigma_2^2)$ (sample size $n_2$).
We want to test $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$.
We assume $\sigma^2 = \sigma_1^2 = \sigma_2^2$.
$\bar{X}_1 \sim N(\mu_1, \sigma^2/n_1)$, $\bar{X}_2 \sim N(\mu_2, \sigma^2/n_2)$
$\bar{X}_1 - \bar{X}_2 \sim N(\mu_1 - \mu_2, \sigma^2/n_1 + \sigma^2/n_2)$
The pooled variance estimate of $\sigma^2$: $s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$, a weighted average of $s_1^2$ and $s_2^2$
Test statistic: $t = \frac{\bar{X}_1 - \bar{X}_2}{s\sqrt{1/n_1 + 1/n_2}} \sim t_{n_1+n_2-2}$ under $H_0$.
Rejection region: $t(\text{obs}) > t_{n_1+n_2-2,1-\alpha/2}$ or $t(\text{obs}) < -t_{n_1+n_2-2,1-\alpha/2}$
p-value $= 2 \Pr(t > |t(\text{obs})| \mid H_0)$
CI: $(\bar{x}_1 - \bar{x}_2) \pm t_{n_1+n_2-2,1-\alpha/2}\, s\sqrt{1/n_1 + 1/n_2}$
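The pooled-variance computation can be sketched from summary statistics. The function name `pooled_t` and the input numbers are hypothetical and chosen only so the arithmetic is easy to follow; they are not the LEAD data.

```python
from math import sqrt

def pooled_t(xbar1, var1, n1, xbar2, var2, n2):
    """Pooled variance, t statistic, and df for the equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    s2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df       # weighted average of the variances
    t = (xbar1 - xbar2) / sqrt(s2 * (1 / n1 + 1 / n2))
    return s2, t, df

# Hypothetical summary statistics: means, sample variances, sample sizes.
s2, t_obs, df = pooled_t(5.0, 4.0, 10, 3.0, 6.0, 12)
print(f"pooled variance = {s2:.2f}, t = {t_obs:.3f}, df = {df}")
```

The observed statistic is then compared with the $t_{n_1+n_2-2}$ distribution, here with 20 degrees of freedom.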

24 Example: Lead data
We now assume the variances of maxfwt in the control group ($n_1 = 78$) and the exposed group ($n_2 = 46$) are the same.
We want to test $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$.
Sample means: $\bar{x}_1 = 62.44$, $\bar{x}_2 = 59.76$; sample variances: $s_1^2$ and $s_2^2$
Pooled variance: $s^2 = \frac{(78 - 1)s_1^2 + (46 - 1)s_2^2}{78 + 46 - 2}$
Test statistic: $t = \frac{62.44 - 59.76}{s\sqrt{1/78 + 1/46}}$
Under $H_0$, $t \sim t_{122}$ (df $= 78 + 46 - 2 = 122$)
p-value: $2 \Pr(t > |t(\text{obs})|)$
At significance level 5%, we cannot reject the null hypothesis. There is no evidence of different means.

25 Two independent samples: different variances
Two samples: $X_{1,1}, \ldots, X_{1,n_1} \sim N(\mu_1, \sigma_1^2)$ (sample size $n_1$) and $X_{2,1}, \ldots, X_{2,n_2} \sim N(\mu_2, \sigma_2^2)$ (sample size $n_2$).
We want to test $H_0: \mu_1 = \mu_2$ vs. $H_1: \mu_1 \neq \mu_2$.
$\bar{X}_1 - \bar{X}_2 \sim N(\mu_1 - \mu_2, \sigma_1^2/n_1 + \sigma_2^2/n_2)$
Test statistic: $t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$
Under $H_0$, the test statistic approximately follows a t-distribution with d.f.
$d = \frac{(s_1^2/n_1 + s_2^2/n_2)^2}{(s_1^2/n_1)^2/(n_1 - 1) + (s_2^2/n_2)^2/(n_2 - 1)}$
Rejection region: $t(\text{obs}) > t_{d,1-\alpha/2}$ or $t(\text{obs}) < -t_{d,1-\alpha/2}$
p-value $= 2 \Pr(t > |t(\text{obs})| \mid H_0)$
CI: $(\bar{x}_1 - \bar{x}_2) \pm t_{d,1-\alpha/2}\sqrt{s_1^2/n_1 + s_2^2/n_2}$
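The test statistic and the approximate degrees of freedom $d$ can be computed directly from summary statistics. In this sketch the function name `unequal_variance_t` and the input values are hypothetical, chosen only to show the arithmetic.

```python
from math import sqrt

def unequal_variance_t(xbar1, var1, n1, xbar2, var2, n2):
    """t statistic and approximate d.f. when the two variances are not assumed equal."""
    a, b = var1 / n1, var2 / n2                  # s_1^2 / n_1 and s_2^2 / n_2
    t = (xbar1 - xbar2) / sqrt(a + b)
    d = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, d

# Hypothetical summary statistics: means, sample variances, sample sizes.
t_obs, d = unequal_variance_t(5.0, 4.0, 10, 3.0, 6.0, 12)
print(f"t = {t_obs:.3f}, approximate d.f. = {d:.2f}")
```

Note that $d$ is generally not an integer; it is at most $n_1 + n_2 - 2$, the degrees of freedom of the equal-variance test.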

26 F-test for Equal Variances
Two samples: $X_{1,1}, \ldots, X_{1,n_1} \sim N(\mu_1, \sigma_1^2)$ (sample size $n_1$) and $X_{2,1}, \ldots, X_{2,n_2} \sim N(\mu_2, \sigma_2^2)$ (sample size $n_2$).
We want to test $H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \neq \sigma_2^2$. In other words, $H_0: \sigma_1^2/\sigma_2^2 = 1$ vs. $H_1: \sigma_1^2/\sigma_2^2 \neq 1$.
$(n_1 - 1)S_1^2/\sigma_1^2 \sim \chi^2_{n_1-1}$ and $(n_2 - 1)S_2^2/\sigma_2^2 \sim \chi^2_{n_2-1}$
Test statistic: $F = S_1^2/S_2^2 \sim F_{n_1-1,n_2-1}$ under $H_0$
Rejection region: $F(\text{obs}) > F_{n_1-1,n_2-1,1-\alpha/2}$ or $F(\text{obs}) < F_{n_1-1,n_2-1,\alpha/2}$
p-value:
If $F(\text{obs}) \ge 1$, then $p = 2 \Pr(F > F(\text{obs}) \mid H_0)$
If $F(\text{obs}) < 1$, then $p = 2 \Pr(F < F(\text{obs}) \mid H_0)$

27 Example: Lead data
Test whether the variances of maxfwt in the control group ($n_1 = 78$) and the exposed group ($n_2 = 46$) are the same or not.
$H_0: \sigma_1^2 = \sigma_2^2$ vs. $H_1: \sigma_1^2 \neq \sigma_2^2$
Test statistic: $F = s_1^2/s_2^2$, the ratio of the two sample variances
The distribution of the test statistic under $H_0$: $F_{77,45}$
p-value: $2 \Pr(F < F(\text{obs}))$
At significance level 5%, we cannot reject the null hypothesis. There is no evidence that the variances are different.

28 Overlapping Confidence Intervals and Statistical Significance
Can we judge whether two statistics are significantly different depending on whether or not their confidence intervals overlap? The answer is: not always.
If two statistics have non-overlapping confidence intervals, they are significantly different.
If they have overlapping confidence intervals, it is not necessarily true that they are not significantly different.

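29 Overlapping Confidence Intervals and Statistical Significance
Assume $\bar{x}_1 - \bar{x}_2 \ge 0$ without loss of generality.
The means are significantly different if $(\bar{x}_1 - \bar{x}_2) > 1.96\sqrt{SE_1^2 + SE_2^2}$.
The CIs do not overlap if $\bar{x}_1 - 1.96\, SE_1 > \bar{x}_2 + 1.96\, SE_2$, which implies $(\bar{x}_1 - \bar{x}_2) > 1.96(SE_1 + SE_2)$.
Since $\sqrt{SE_1^2 + SE_2^2} \le SE_1 + SE_2$, non-overlapping CIs imply a significant difference, but a significant difference does not imply non-overlapping CIs.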
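A small numerical illustration of this gap, using made-up means and standard errors (all four values below are hypothetical): the difference is significant at the 5% level, yet the two 95% CIs still overlap.

```python
from math import sqrt

# Hypothetical sample means and standard errors.
x1, se1 = 10.0, 1.0
x2, se2 = 7.0, 1.0

diff = x1 - x2
threshold_significant = 1.96 * sqrt(se1 ** 2 + se2 ** 2)   # about 2.77
threshold_no_overlap = 1.96 * (se1 + se2)                  # 3.92

significant = diff > threshold_significant
# Lower end of the CI for x1 versus upper end of the CI for x2.
cis_overlap = (x1 - 1.96 * se1) <= (x2 + 1.96 * se2)
print(significant, cis_overlap)
```

Any difference between the two thresholds (here between about 2.77 and 3.92) gives a significant result with overlapping intervals.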

30 Summary
1 What are your hypotheses?
2 Identify the data type and test statistic: t-test, z-test, $\chi^2$-test, F-test; one-sample or two-sample (paired or independent)
3 Perform a test
4 Go back to the numerical and/or graphical summary and confirm that your test result matches your data
