Statistics: CI, Tolerance Intervals, Exceedance, and Hypothesis
Lecture Notes
Warren Myers 2001-2005, Revised by Guffey, 2006

Confidence intervals on mean

Normal distribution:

$$CL_{1-\alpha} = \bar{x} \pm t_{1-\alpha,\,n-1} \cdot \frac{s}{\sqrt{n}}$$

Log-normal distribution:

$$CL_{1-\alpha} = \exp\left(\bar{x}_{\ln x} \pm t_{1-\alpha,\,n-1} \cdot \frac{s_{\ln x}}{\sqrt{n}}\right) = \exp\left(\ln(GM) \pm t_{1-\alpha,\,n-1} \cdot \frac{\ln(GSD)}{\sqrt{n}}\right)$$

For a 2-sided 95% confidence interval, use $t_{1-\alpha/2,\,n-1} = t_{0.975,\,n-1}$.
For a 1-sided 95% confidence interval, use $t_{1-\alpha,\,n-1} = t_{0.95,\,n-1}$.
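The log-normal formula relies on the identities $\bar{x}_{\ln x} = \ln(GM)$ and $s_{\ln x} = \ln(GSD)$. The sketch below shows one way to obtain GM and GSD from raw measurements by log-transforming them first. The `exposures` values are hypothetical and chosen only for illustration; they do not come from these notes.

```python
import math
import statistics

# Hypothetical exposure measurements (illustrative values only, not from the notes).
exposures = [12.0, 18.5, 24.0, 31.0, 47.5, 60.0]

logs = [math.log(x) for x in exposures]
mean_ln = statistics.mean(logs)    # mean of ln(x)
sd_ln = statistics.stdev(logs)     # sample standard deviation of ln(x), n - 1 denominator

gm = math.exp(mean_ln)             # geometric mean: exp of the mean of the logs
gsd = math.exp(sd_ln)              # geometric standard deviation: exp of the SD of the logs

print(f"GM = {gm:.2f}, GSD = {gsd:.2f}")
```

With GM and GSD in hand, the log-normal confidence limit needs only n and a t-table value.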
Example #1: confidence interval

Given a GM of 24, a GSD of 2.7, and a sample size of 17, determine a 2-sided 95% confidence interval on the GM.

Example #1 solution

$$CL_{1-\alpha} = \exp\left(\bar{x}_{\ln x} \pm t_{1-\alpha/2,\,n-1} \cdot \frac{s_{\ln x}}{\sqrt{n}}\right) = \exp\left(\ln(GM) \pm t_{1-\alpha/2,\,n-1} \cdot \frac{\ln(GSD)}{\sqrt{n}}\right)$$

$$= \exp\left(\ln(24) \pm t_{0.975,\,16} \cdot \frac{\ln(2.7)}{\sqrt{17}}\right) = \exp\left(3.178 \pm 2.12 \cdot 0.241\right)$$

$$LCL_{95\%} = \exp\left(3.178 - 2.12 \cdot 0.241\right) = 14.4$$

$$UCL_{95\%} = \exp\left(3.178 + 2.12 \cdot 0.241\right) = 40$$

We are 95% confident the true population geometric mean is between 14.4 and 40.
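The arithmetic in Example #1 can be checked with a short script. This is a minimal sketch: the function name `lognormal_ci` is made up for illustration, and the t critical value is passed in from a t-table rather than computed.

```python
import math

def lognormal_ci(gm, gsd, n, t_crit):
    """Two-sided confidence limits on the geometric mean.

    t_crit is the Student t critical value read from a table
    (it is an input here, not computed).
    """
    half_width = t_crit * math.log(gsd) / math.sqrt(n)
    center = math.log(gm)
    return math.exp(center - half_width), math.exp(center + half_width)

# Example #1 inputs; 2.12 is the table value t(0.975, 16) used in the solution.
lcl, ucl = lognormal_ci(gm=24, gsd=2.7, n=17, t_crit=2.12)
print(f"{lcl:.1f} to {ucl:.1f}")  # prints: 14.4 to 40.0
```

The interval is not symmetric about the GM of 24, because the limits are symmetric on the log scale and are then exponentiated.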
95%, 95% Tolerance Limits on Sample Distribution

Normal distribution:

$$TL_{0.95,\,0.95} = \bar{x} \pm K_{\gamma,P,n} \cdot s$$

Log-normal distribution:

$$TL_{0.95,\,0.95} = \exp\left(\bar{x}_{\ln x} \pm K_{\gamma,P,n} \cdot s_{\ln x}\right) = \exp\left(\ln(GM) \pm K_{\gamma,P,n} \cdot \ln(GSD)\right)$$

Where: $\gamma$ is the probability that at least a proportion $P$ of the population falls within the tolerance limit.

Example #2: tolerance limits

Determine an upper 95%, 95% tolerance limit on a distribution of exposures defined with a GM of 24, a GSD of 2.7, and a sample size of 17.
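The log-normal tolerance limit can be sketched as a small function. The name `lognormal_tolerance_limit` and its parameters are assumptions for illustration. The factor K is not computed: it must be read from a tolerance-factor table for the chosen γ, P, and n. The value 2.486 is the one these notes use for γ = 0.95, P = 0.95, n = 17.

```python
import math

def lognormal_tolerance_limit(gm, gsd, k, upper=True):
    """Tolerance limit for log-normal data.

    k is the tolerance factor K(gamma, P, n) taken from a table;
    upper=True gives the upper limit, upper=False the lower limit.
    """
    sign = 1.0 if upper else -1.0
    return math.exp(math.log(gm) + sign * k * math.log(gsd))

# Example #2 inputs with the tabulated K(0.95, 0.95, 17) = 2.486.
utl = lognormal_tolerance_limit(gm=24, gsd=2.7, k=2.486)
print(round(utl))  # prints: 284
```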
Example #2 solution

$$UTL_{0.95,\,0.95} = \exp\left(\ln(GM) + K_{0.95,\,0.95,\,17} \cdot \ln(GSD)\right) = \exp\left(\ln(24) + 2.486 \cdot \ln(2.7)\right) = \exp(5.647) = 284$$

We are 95% confident that 95% of the exposures in the distribution will be less than 284.

Exceedance Fraction

For normally or lognormally distributed data, an estimate of the fraction (F) of future exposures that exceed a particular level (sometimes called the exceedance fraction) can be calculated as:

Normal:

$$\text{Exc. Fract.} = P(c > OEL) = P\left(Z > \frac{OEL - \bar{x}}{s}\right)$$

Lognormal:

$$\text{Exc. Fract.} = P(c > OEL) = P\left(Z > \frac{\ln(OEL) - \ln(GM)}{\ln(GSD)}\right)$$

Interpretation: the probability that a future exposure measurement made from the same sample group will exceed your chosen exposure limit is ____ percent.
Example #3: exceedance

Given an OEL of 35, a GM of 24, and a GSD of 2.7, what is the probability that a future exposure measurement made from the same sample group will exceed the exposure limit?

Example #3 solution

$$F = P(c > OEL) = P\left(Z > \frac{\ln(OEL) - \ln(GM)}{\ln(GSD)}\right) = P\left(Z > \frac{\ln(35) - \ln(24)}{\ln(2.7)}\right)$$

$$= P(Z > 0.38) = 1 - P(Z < 0.38) = 1 - 0.648 = 35.2\%$$

The probability that a future exposure measurement made from this same sample group will exceed 35 is 35.2%.
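The lognormal exceedance calculation can be reproduced without a Z-table by using the standard normal CDF from Python's standard library. This is a minimal sketch; the function name `exceedance_fraction_lognormal` is hypothetical.

```python
import math
from statistics import NormalDist

def exceedance_fraction_lognormal(oel, gm, gsd):
    """Estimated fraction of exposures above the OEL for log-normal data."""
    z = (math.log(oel) - math.log(gm)) / math.log(gsd)
    return 1.0 - NormalDist().cdf(z)   # upper-tail area beyond z

# Example #3 inputs.
f = exceedance_fraction_lognormal(oel=35, gm=24, gsd=2.7)
print(f"{f:.1%}")  # prints: 35.2%
```

Raising the OEL or lowering the GM increases z and therefore shrinks the exceedance fraction.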
Example #4: exceedance

Given: OEL = 35, GM = 24, GSD = 2.7.

What would your GM need to be if you wanted to be 80% sure that a future exposure measurement made from the same sample group will not exceed the exposure limit (that is, to have an exceedance fraction of no more than 20%)?

Solution #4

$$Z_p = \frac{\ln(OEL) - \ln(GM)}{\ln(GSD)}$$

Solving for $\ln(GM)$ with $Z_p = Z_{0.80} = 0.842$:

$$\ln(GM) = \ln(OEL) - Z_p \cdot \ln(GSD) = \ln(35) - 0.842 \cdot \ln(2.7) = 3.555 - 0.8363 = 2.7187$$

$$GM = e^{2.7187} \approx 15$$
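The same rearrangement can be sketched in code, with the inverse normal CDF standing in for the Z-table lookup. The function name `target_gm` and the parameter `max_exceedance` are assumptions for illustration.

```python
import math
from statistics import NormalDist

def target_gm(oel, gsd, max_exceedance):
    """GM at which the lognormal exceedance fraction equals max_exceedance."""
    z_p = NormalDist().inv_cdf(1.0 - max_exceedance)  # about 0.842 for 20%
    return math.exp(math.log(oel) - z_p * math.log(gsd))

# Example #4 inputs: OEL = 35, GSD = 2.7, exceedance fraction of 20%.
gm_needed = target_gm(oel=35, gsd=2.7, max_exceedance=0.20)
print(f"{gm_needed:.1f}")
```

The unrounded result is slightly above 15 and agrees with the value quoted in the solution once rounded to the nearest whole number.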
Hypothesis

- Used to choose between two alternatives (in regression, for example, we routinely test whether the slope of the regression line is equal to 0 or not).
- We begin by stating the problem, then stating the null hypothesis and the alternative hypothesis.
- The null hypothesis states that the results we observe are due to chance.
- The null hypothesis always asserts that the variable is actually constant.
- The alternative hypothesis is unlikely to be true unless the null hypothesis is likely to be false.

Hypothesis testing

- Calculate a test statistic. The choice of statistic depends on the sample size and on the distribution of the statistic (we saw that sample averages are normally distributed).
- Find the P-value associated with the test statistic.
Hypothesis testing - Significance Level

- The significance level is the threshold at which we either reject Ho with confidence or fail to reject Ho.
- The choice of significance level depends on the problem.
- Typically, 5% is called statistically significant and 1% highly significant.

Type I versus Type II error:

                        Statistical decision
True state of null      Reject null       Do not reject null
True                    Type I error      Correct
Not true                Correct           Type II error
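The link between the significance level and Type I error can be illustrated by simulation. The sketch below is an illustration under stated assumptions, not part of the original notes: it repeatedly draws samples from a population where the null hypothesis is true (mean 0, known standard deviation 1), runs a one-sided z-test at α = 0.05 each time, and counts how often the null is wrongly rejected. The function name, sample size, trial count, and seed are all arbitrary choices.

```python
import math
import random
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=4000, seed=1):
    """Fraction of one-sided z-tests that reject a TRUE null (mu = 0, sigma = 1)."""
    rng = random.Random(seed)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # critical value
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = sample_mean / (1.0 / math.sqrt(n))    # (mean - 0) / (sigma / sqrt(n))
        if z > z_alpha:
            rejections += 1                       # a Type I error: null is true
    return rejections / trials

rate = type_i_error_rate()
print(f"observed Type I error rate: {rate:.3f}")
```

The observed rate should land close to the chosen α, which is what the "True / Reject null" cell of the table describes.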
t-tests

For testing hypotheses involving two means (>, <, or different from a given population mean).

Use the t-test when:
- testing the significance of the mean(s)
- the number of subjects in the sample is less than 30
- the data are continuous
- the true standard deviation of the population is unknown

Hypothesis construction:
- $H_0: \bar{x} = \mu$
- $H_a: \bar{x} > \mu$
- $H_a: \bar{x} < \mu$
- $H_a: \bar{x} \neq \mu$

Determine the level of significance: $\alpha = 0.05$ or $\alpha = 0.01$.

Difference in means, variances assumed equal:

$$t = \frac{\bar{X}_a - \bar{X}_b}{S_p \sqrt{\dfrac{1}{n_a} + \dfrac{1}{n_b}}}$$

$$S_p = \sqrt{\frac{(n_a - 1)\,s_a^2 + (n_b - 1)\,s_b^2}{n_a + n_b - 2}}$$

$$df = n_a + n_b - 2$$

Where:
- $df$ = degrees of freedom
- $\bar{X}$ = observed mean of each group sample
- $s^2$ = observed variance of each sample
- $S_p$ = pooled estimate of the standard deviation
- $n$ = number of samples
Decision rule

If $T > t_{\alpha,\,n-1}$, the null hypothesis is rejected and the alternative is accepted.

P(error if reject) = $\alpha$

t-test example
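As an illustration of the pooled two-sample t-test, the sketch below applies the formulas for $t$, $S_p$, and $df$ to two small groups. The data in `group_a` and `group_b` are hypothetical and are not the example from the original slide. The critical value 1.812 is the standard table entry for a one-sided test at α = 0.05 with 10 degrees of freedom; it is entered by hand rather than computed.

```python
import math
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Two-sample t statistic assuming equal variances; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    s2_a, s2_b = variance(sample_a), variance(sample_b)   # sample variances
    df = n_a + n_b - 2
    s_p = math.sqrt(((n_a - 1) * s2_a + (n_b - 1) * s2_b) / df)
    t = (mean(sample_a) - mean(sample_b)) / (s_p * math.sqrt(1 / n_a + 1 / n_b))
    return t, df

# Hypothetical measurements for two groups (illustrative only).
group_a = [5.1, 4.8, 6.0, 5.5, 5.9, 6.2]
group_b = [4.2, 4.9, 4.4, 5.0, 4.1, 4.6]

t_stat, df = pooled_t(group_a, group_b)
t_crit = 1.812                 # table value t(0.95, 10), one-sided alpha = 0.05
reject_null = t_stat > t_crit  # decision rule: reject Ho if T exceeds the critical t

print(f"t = {t_stat:.2f}, df = {df}, reject Ho: {reject_null}")
```

Here the test is Ha: mean of group a > mean of group b, and the degrees of freedom follow the pooled formula $n_a + n_b - 2$.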
z-tests

For testing hypotheses involving two means (>, <, ≠).

Test assumptions:
- n > 25
- normal distribution
- the standard deviation of the population is known

Hypothesis construction:
- $H_0: \bar{x} = \mu$
- $H_a: \bar{x} > \mu$
- $H_a: \bar{x} < \mu$
- $H_a: \bar{x} \neq \mu$

Determine the level of significance: $\alpha = 0.05$ or $\alpha = 0.01$.

Test formula:

$$z = \frac{\bar{X}_a - \bar{X}_b}{\sqrt{\dfrac{\sigma_a^2}{n_a} + \dfrac{\sigma_b^2}{n_b}}}$$

Where:
- $\bar{X}$ = observed mean of each group
- $\sigma^2$ = true variance of each group
- $n$ = number of samples
Decision rule

If $Z > Z_{\alpha}$, the null hypothesis is rejected and the alternative is accepted.

The probability of incorrectly rejecting the null hypothesis (Type I error) on the basis of the results is equal to $\alpha$.

z-test example
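As an illustration of the two-sample z-test, the sketch below evaluates the test formula and the decision rule for a one-sided alternative. All input numbers are hypothetical and are not the example from the original slide. The function name `two_sample_z` is an assumption, and the population variances are treated as known, as the test requires.

```python
import math
from statistics import NormalDist

def two_sample_z(mean_a, mean_b, var_a, var_b, n_a, n_b):
    """z statistic for the difference of two means with known variances."""
    return (mean_a - mean_b) / math.sqrt(var_a / n_a + var_b / n_b)

# Hypothetical summary statistics (illustrative only).
z = two_sample_z(mean_a=52.0, mean_b=50.0, var_a=16.0, var_b=25.0, n_a=40, n_b=50)

alpha = 0.05
z_alpha = NormalDist().inv_cdf(1.0 - alpha)   # one-sided critical value, about 1.645
p_value = 1.0 - NormalDist().cdf(z)           # upper-tail P-value
reject_null = z > z_alpha                     # decision rule: reject Ho if Z > Z_alpha

print(f"z = {z:.3f}, p = {p_value:.4f}, reject Ho: {reject_null}")
```

Comparing `z` with `z_alpha` and comparing `p_value` with `alpha` lead to the same decision; the P-value simply reports how far into the tail the observed statistic falls.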