PubH 5450 Biostatistics I
Prof. Carlin
Lecture 11

Part I: Review

Confidence interval for the mean:
- Known $\sigma$ (population standard deviation): $\bar{x} \pm z_{1-\alpha/2}\,\sigma/\sqrt{n}$. Valid for small $n$ if the population is normal, or for large $n$ with any population.
- Unknown $\sigma$: $\bar{x} \pm t_{n-1,\,1-\alpha/2}\,s/\sqrt{n}$. Valid for small $n$ if the population is normal, or for large $n$ if the population is near-normal.
One-Sample Test: $H_0: \mu = \mu_0$, and p-values

Known $\sigma$ (population standard deviation): the z statistic
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
has a $N(0,1)$ distribution if the population is normal, or if $n$ is large.

Unknown $\sigma$: the t statistic
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
has a $t_{n-1}$ distribution if the population is normal, or if $n$ is large and the population is nearly normal.

p-values:
- $H_1: \mu \neq \mu_0$: $p = P(|Z| > |z|)$ or $P(|T| > |t|)$.
- $H_1: \mu > \mu_0$: $p = P(Z > z)$ or $P(T > t)$.
- $H_1: \mu < \mu_0$: $p = P(Z < z)$ or $P(T < t)$.

Interpretations (hypothesis first!):
- The long-run probability that the 95% CI for the mean covers the true population mean is 95%.
- If the population mean is truly $\mu_0$, the probability of observing a z or t statistic as extreme or more extreme than yours is the p-value.
- In both cases, the probabilities refer to relative frequencies in an infinite number of repetitions of the experiment.
- If the hypothesis depends on the data ("data snooping"), then the requirement of identical experiments is violated, and the interpretation of the p-value becomes difficult or impossible. That is, the p-value will tend to overstate the evidence against a null that was generated by the data itself.
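As a quick numerical illustration of the t statistic above, here is a minimal sketch using only Python's standard library (the course software is SAS; Python is used here purely for illustration, and the data and null value $\mu_0 = 5$ are hypothetical):

```python
from statistics import mean, stdev
from math import sqrt

def one_sample_t(data, mu0):
    """One-sample t statistic: t = (xbar - mu0) / (s / sqrt(n))."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    return (xbar - mu0) / (s / sqrt(n))

# Hypothetical data for illustration only.
t = one_sample_t([4.8, 5.1, 5.3, 4.9, 5.4, 5.0], 5.0)
print(round(t, 2))  # 0.88
```

The resulting $t$ would then be compared against the $t_{n-1}$ distribution to obtain a p-value.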
Misinterpretations of the CI and of the p-value

For a 95% CI $(a, b)$: "The probability that the true population parameter is within $(a, b)$ is 0.95." NO NO NO! (The parameter is fixed; it is the interval that is random.)

If $t = 2.7$ with d.f. = 18 and $p = 0.01$, the following are all misinterpretations:
- The probability that the null hypothesis is true is 0.01.
- The probability of observing $z$ when the null hypothesis is true is 0.01.
- The probability of observing such a difference due to chance is 0.01.
- The probability of finding a significant result in a replicate experiment is 0.99.

Part II: p-values and Critical Values

For $H_0: \mu = \mu_0$ and a two-sided $H_1$, we reject the null when $p = P(|T| > |t|) < 0.05$, where $t$ is the observed t statistic. The bigger $|t|$ is, the smaller the p-value. Alternatively, we can find $t_0 > 0$ such that $P(|T| > t_0) = 0.05$ and then reject the null if $|t| > t_0$. Here $t_0 = t_{n-1,\,1-\alpha/2}$ (the value we used in the CI context), and $t_0$ is called the critical value of the test.
Rejection Region

We reject $H_0$ at significance level $\alpha$ if and only if
$$|t| = \frac{|\bar{x} - \mu_0|}{s/\sqrt{n}} > t_{n-1,\,1-\alpha/2}
\iff |\bar{x} - \mu_0| > t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}
\iff \bar{x} > \mu_0 + t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}
\ \text{ OR }\ \bar{x} < \mu_0 - t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}.$$

Definition: The rejection region is the range of values of $\bar{x}$ (or whatever sample statistic we are using) for which $H_0$ is rejected.

CIs and Testing

For $H_0: \mu = \mu_0$ and significance level $\alpha$, we do not reject $H_0$ if
$$\bar{x} \in \left(\mu_0 - t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}},\ \mu_0 + t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}\right).$$
Equivalently, we do not reject $H_0$ if
$$\mu_0 \in \left(\bar{x} - t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}},\ \bar{x} + t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}\right),$$
i.e., if the null mean value lies inside the 100(1 − α)% CI for $\mu$.

In many situations, confidence intervals and hypothesis tests are equivalent: the confidence interval is often the non-rejection region of the corresponding hypothesis test. Always report the sample size, the relevant sample statistic, and its CI. Report the p-value (and the sampling distribution of the test statistic) when appropriate, rather than just "reject" or "fail to reject" at some level $\alpha$.
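The test/CI equivalence can be checked numerically: rejecting at level $\alpha$ happens exactly when $\mu_0$ falls outside the 100(1 − α)% CI. A minimal Python sketch (the "before" radon readings from the example below are reused; $\mu_0 = 100$ is an assumed null value, and the critical value $t_{5,\,0.975} = 2.571$ is taken from a t table):

```python
from statistics import mean, stdev
from math import sqrt

data = [91.9, 111.4, 105.4, 103.8, 96.6, 104.8]  # 'before' radon readings
mu0 = 100.0    # assumed null mean, for illustration only
tcrit = 2.571  # t_{5, 0.975} from a t table (n - 1 = 5 df, alpha = 0.05)

n, xbar, s = len(data), mean(data), stdev(data)
se = s / sqrt(n)

# Decision via the test statistic:
t = (xbar - mu0) / se
reject_by_test = abs(t) > tcrit

# Decision via the confidence interval:
lo, hi = xbar - tcrit * se, xbar + tcrit * se
reject_by_ci = not (lo <= mu0 <= hi)

# The two decisions always agree:
assert reject_by_test == reject_by_ci
```

For this data, $|t| \approx 0.82 < 2.571$ and $\mu_0 = 100$ lies inside the CI, so both routes fail to reject.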
Paired Samples

Sometimes we want to compare two groups, but only within matched pairs, e.g., measurements of the same subjects before and after they receive some treatment. While this initially appears to be a two-sample problem, the pairing and the lack of independence between the samples mean that the best approach is to first convert the problem into a one-sample problem by taking the differences of the pairs. See M&M Example 7.7!

If $(X_i, Y_i)$ are the paired samples, define $D_i = X_i - Y_i$, and construct a CI for the true mean difference $\mu_D = \mu_X - \mu_Y$ using our usual one-sample CI techniques. Or, test the hypothesis $H_0: \mu_X = \mu_Y \iff \mu_D = 0$ using the usual one-sample test techniques!

For paired samples, the sample size is the number of pairs, not the total number of data points. A common mistake is treating paired two-sample problems as independent two-sample problems -- a big difference!

In SAS (V8.0+): PROC TTEST with the PAIRED statement.

data radon2;
  input before after @@;
cards;
91.9 97.8  111.4 122.3  105.4 95.0
103.8 99.6  96.6 119.3  104.8 101.7
;
proc ttest data=radon2 alpha=0.05;
  paired before after;
run;

Statistics (Difference: before − after):
N = 6; Mean = −3.633 (95% CL: −16.27, 9.0045);
Std Dev = 12.043 (95% CL: 7.517, 29.536);
Std Err = 4.9163; Minimum = −22.7; Maximum = 10.4.
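The same paired analysis can be reproduced by hand, reducing it to a one-sample problem on the differences. A minimal Python sketch using the radon data above (Python stands in for SAS here, purely for illustration):

```python
from statistics import mean, stdev
from math import sqrt

before = [91.9, 111.4, 105.4, 103.8, 96.6, 104.8]
after  = [97.8, 122.3,  95.0,  99.6, 119.3, 101.7]

# Reduce the paired problem to a one-sample problem on the differences.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)                 # sample size = number of pairs, not 2n
dbar, sd = mean(diffs), stdev(diffs)
t = dbar / (sd / sqrt(n))      # one-sample t statistic for H0: mu_D = 0

# Mean ~ -3.633, Std Dev ~ 12.043, t ~ -0.74, matching the SAS output above.
print(round(dbar, 3), round(sd, 3), round(t, 2))
```

With 5 degrees of freedom, this $t \approx -0.74$ reproduces the SAS t value.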
PROC TTEST (V8.0+): One-Sample Test

T-Tests (Difference: before − after): DF = 5; t Value = −0.74; Pr > |t| = 0.4931.

The same analysis as an explicit one-sample test on the differences:

data radondiff;
  set radon2;
  diff = after - before;
proc ttest data=radondiff h0=0.0 alpha=0.05;
  var diff;
run;

Part III: Two-Sample Problems

The goal of inference is to compare the responses in two groups. Each group is considered to be a sample from a distinct distribution. The responses in each group are independent of those in the other group.
Two-Sample z Statistic

Definition: If a sample of size $n_1$ is drawn from $N(\mu_1, \sigma_1^2)$ and a sample of size $n_2$ is drawn from $N(\mu_2, \sigma_2^2)$, and $\bar{x}_1$ and $\bar{x}_2$ are the sample means for each sample, then the two-sample z statistic
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
has the standard normal $N(0,1)$ distribution.

Unknown but Equal Variances

When the population variances $\sigma_1^2$ and $\sigma_2^2$ are not known, the situation is a bit more complex. If the two populations can be assumed to have the same variance (i.e., $\sigma_1^2 = \sigma_2^2 = \sigma^2$), then we can use the pooled estimator of $\sigma^2$:
$$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1+n_2-2}
= \frac{\sum_{i=1}^{n_1}(x_{1i}-\bar{x}_1)^2 + \sum_{i=1}^{n_2}(x_{2i}-\bar{x}_2)^2}{n_1+n_2-2}.$$

[Crummy] F-test for Equal Variances

We can use the F statistic $F = s_1^2/s_2^2$ to test for the equality of the two population variances, i.e., $H_0: \sigma_1^2/\sigma_2^2 = 1$ against $H_1: \sigma_1^2/\sigma_2^2 \neq 1$. $F$ has an F distribution with $(n_1-1, n_2-1)$ degrees of freedom. Statistical software is available to give p-values for this F-test... BUT the F statistic is so nonrobust to (easily influenced by) departures from normality that it is rarely used in practice! (See M&M p. 556.)

Pooled-Sample t-test

Definition: When the two populations have the same variance, we can test $H_0: \mu_1 = \mu_2$ using the pooled-sample t statistic
$$t = \frac{\bar{x}_1 - \bar{x}_2}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}},$$
where $s_p = \sqrt{s_p^2}$, the pooled standard error estimate defined above.
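The pooled estimator above is just a degrees-of-freedom-weighted average of the two sample variances. A minimal Python sketch (illustrative only; the sample data are hypothetical):

```python
from statistics import stdev

def pooled_sd(x, y):
    """Pooled SD: s_p^2 = ((n1-1)s1^2 + (n2-1)s2^2) / (n1 + n2 - 2)."""
    n1, n2 = len(x), len(y)
    s1, s2 = stdev(x), stdev(y)
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return sp2 ** 0.5

# With equal sample sizes, s_p^2 is simply the average of the two variances:
# here s1^2 = 1 and s2^2 = 4, so s_p^2 = 2.5.
sp = pooled_sd([1, 2, 3], [2, 4, 6])
print(round(sp**2, 1))  # 2.5
```

Note that $s_p$ pools the spreads only; it says nothing about the two means.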
Pooled-Sample t-test: Confidence Interval

Under $H_0: \mu_1 = \mu_2$, the pooled-sample t statistic has a t distribution with $n_1 + n_2 - 2$ degrees of freedom. For a two-sided alternative $H_1: \mu_1 \neq \mu_2$, the p-value is once again $p = P(|T| > |t|)$. See M&M Example 7.20.

Based on a similar argument as in the one-sample case, we can also define a 100(1 − α)% CI for the difference of the two population means:
$$(\bar{x}_1 - \bar{x}_2) \pm t^*\, s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}},$$
where $t^* = t_{n_1+n_2-2,\,1-\alpha/2}$ and $s_p$ is again the pooled standard error estimate. See M&M Example 7.21.

Unknown and Unequal Variances

For the more general case when the variances are not equal, the two-sample t statistic is defined as
$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad (1)$$
which does not quite have a t distribution under $H_0: \mu_1 = \mu_2$...

Do you still want to test $\mu_1 = \mu_2$ when the variances are unequal? If you do, we can use a t distribution to approximate the actual sampling distribution of the two-sample t statistic. The degrees of freedom is not $n_1 + n_2 - 2$; it is smaller, and there are several possible formulae for calculating it. Typically we use either the smaller of $n_1 - 1$ and $n_2 - 1$ (the conservative choice), or the ludicrous (Satterthwaite) formula on M&M p. 536. Note that smaller df generally corresponds to a wider confidence interval or a bigger critical value (i.e., harder to reject $H_0$).
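The unequal-variance statistic (1) and the conservative df choice can be sketched in a few lines of Python (illustrative only; the sample data are hypothetical):

```python
from statistics import mean, stdev
from math import sqrt

def two_sample_t(x, y):
    """Two-sample t statistic (1) with unpooled standard error."""
    n1, n2 = len(x), len(y)
    se = sqrt(stdev(x)**2 / n1 + stdev(y)**2 / n2)
    return (mean(x) - mean(y)) / se

def conservative_df(x, y):
    """Conservative approximation: the smaller of n1 - 1 and n2 - 1."""
    return min(len(x) - 1, len(y) - 1)

t = two_sample_t([1, 2, 3], [2, 4, 6])
df = conservative_df([1, 2, 3], [2, 4, 6])
print(round(t, 3), df)  # -1.549 2
```

Using df = 2 here instead of $n_1 + n_2 - 2 = 4$ widens the interval and raises the critical value, as noted above.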
See M&M Examples 7.14 and 7.18!