Statistics 20: Final Exam Solutions, Summer Session 2007

1. (20 points) Testing for Diabetes.

(a) (3 points) Give estimates for the sensitivity of Test I and of Test II.

Solution: 156 of the 223 patients tested positive by Test I. Hence the estimated sensitivity of Test I is $156/223 = 0.6996$. Similarly, the estimated sensitivity of Test II is $200/223 = 0.8969$.

(b) (10 points) Are the results of the two tests independent? Use a $\chi^2$-test to test for independence.

Solution: We want to test $H_0$: Test I and Test II are independent vs. $H_a$: the two tests are not independent. The expected counts for the four cells under the null hypothesis are as follows:

                     Test II positive   Test II negative
Test I positive          139.9103            16.0897
Test I negative           60.0897             6.9103

The value of the $\chi^2$ test statistic for the above hypothesis is

\[
X^2 = \frac{(142 - 139.9103)^2}{139.9103} + \frac{(14 - 16.0897)^2}{16.0897} + \frac{(58 - 60.0897)^2}{60.0897} + \frac{(9 - 6.9103)^2}{6.9103} = 1.0072,
\]

and the degrees of freedom are $(2 - 1)(2 - 1) = 1$. From the $\chi^2$ table, the P-value for this test is more than 0.25. Hence we fail to reject the null hypothesis that the two tests are independent at any reasonable level.

(c) (12 points) We assume that Test I has sensitivity 72% and specificity 80%. We also assume that 5% of our population has diabetes.

i. (4 points) What is the probability that a randomly selected individual from the population will be tested positive by Test I?

Solution:

\[
\begin{aligned}
P(\text{positive by Test I}) &= P(\text{positive by Test I} \mid \text{diabetes})\, P(\text{diabetes}) + P(\text{positive by Test I} \mid \text{no diabetes})\, P(\text{no diabetes}) \\
&= \frac{72}{100} \cdot \frac{5}{100} + \left(1 - \frac{80}{100}\right) \frac{95}{100} = \frac{72 \cdot 5 + 20 \cdot 95}{10000} = 0.226.
\end{aligned}
\]
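The arithmetic in parts (b) and (c) i can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the observed counts (142, 14, 58, 9) are the ones that appear in the test statistic above. The P-value line relies on the fact that a $\chi^2$ variable with 1 degree of freedom is the square of a standard normal.

```python
import math

# Observed 2x2 counts: rows = Test I (positive, negative),
# columns = Test II (positive, negative).
observed = [[142, 14], [58, 9]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: (row total * column total) / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(2)
    for j in range(2)
)

# For 1 degree of freedom, P(chi2_1 > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

# Part (c) i: law of total probability with the assumed test characteristics.
sensitivity, specificity, prevalence = 0.72, 0.80, 0.05
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)

print(round(chi2, 4), p_value > 0.25, round(p_positive, 3))
```

The computed P-value is consistent with the table lookup "more than 0.25" used in the solution.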

ii. (4 points) What is the probability that a randomly selected individual from the population has diabetes given that his/her Test I result is positive?

Solution:

\[
P(\text{diabetes and positive by Test I}) = P(\text{positive by Test I} \mid \text{diabetes})\, P(\text{diabetes}) = \frac{72}{100} \cdot \frac{5}{100} = 0.036.
\]

Hence, the conditional probability that a randomly selected individual from the population has diabetes given that his/her Test I result is positive is $0.036 / 0.226 = 0.1593$.

iii. (4 points) What is the probability that a randomly selected individual from the population will be tested positive by Test I given that he/she has diabetes?

Solution: The conditional probability that a randomly selected individual from the population will be tested positive by Test I given that he/she has diabetes is the sensitivity of Test I, which is 0.72.

2. (10 points)

(a) (5 points) Find $\alpha$ = the probability of a Type I error, that is, the probability that $H_0$ is rejected when actually $H_0$ is true.

Solution: $\alpha = P(\bar{x} > 0 \mid \mu = 0) = 0.5$, since $\bar{x}$ follows a normal distribution with mean $\mu$ and variance $\sigma^2/n$.

(b) (5 points) Find the power of this test when $\mu = 0.2$.

Solution: The power of this test when $\mu = 0.2$ is

\[
P(\bar{x} > 0 \mid \mu = 0.2) = P\left(\frac{\bar{x} - 0.2}{\sigma/\sqrt{n}} > \frac{-0.2}{\sigma/\sqrt{n}}\right) = P\left(Z > \frac{-0.2}{1/\sqrt{16}}\right) = P(Z > -0.8) = 0.7881.
\]

3. (20 points) Voting.

(a) (5 points) Let $p$ be the probability that the answer of a surveyed voter is "yes". What is the expression of $p$ in terms of $q$?

Solution:

\[
\begin{aligned}
p &= P(\text{answer is yes}) \\
&= P(\text{answer is yes} \mid \text{voted})\, P(\text{voted}) + P(\text{answer is yes} \mid \text{did not vote})\, P(\text{did not vote}) \\
&= 1 \cdot 0.66 + q \cdot 0.34 = 0.66 + 0.34q.
\end{aligned}
\]
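The posterior probability in problem 1(c) ii and the error probabilities in problem 2 can be reproduced numerically. The sketch below uses `statistics.NormalDist` from the Python standard library (3.8+). The values $\sigma = 1$ and $n = 16$ are read off the $1/\sqrt{16}$ term in the power calculation; the helper name `reject_probability` is hypothetical.

```python
from statistics import NormalDist

# Problem 1(c) ii: Bayes' rule with the assumed test characteristics.
sensitivity, specificity, prevalence = 0.72, 0.80, 0.05
p_joint = sensitivity * prevalence                      # P(diabetes and positive)
p_positive = p_joint + (1 - specificity) * (1 - prevalence)
posterior = p_joint / p_positive                        # P(diabetes | positive)


def reject_probability(mu, sigma=1.0, n=16, cutoff=0.0):
    """P(sample mean > cutoff) when the true mean is mu.

    The sample mean is normal with mean mu and standard deviation sigma / sqrt(n).
    """
    standard_error = sigma / n ** 0.5
    return 1 - NormalDist(mu, standard_error).cdf(cutoff)


alpha = reject_probability(0.0)   # Type I error rate: H0 (mu = 0) is true
power = reject_probability(0.2)   # power at the alternative mu = 0.2

print(round(posterior, 4), round(alpha, 4), round(power, 4))
```

Passing a different `mu` to `reject_probability` traces out the power of the same rejection rule at other alternatives.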

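(b) (5 points)

Solution: We know that for $\hat{p} = X/n$, $E(\hat{p}) = p$ and $\mathrm{Var}(\hat{p}) = p(1 - p)/n$. Hence

\[
E(\hat{q}) = E(-1.94 + 2.94\hat{p}) = -1.94 + 2.94\,E(\hat{p}) = -1.94 + 2.94p = q
\]

and

\[
\mathrm{Std}(\hat{q}) = \mathrm{Std}(-1.94 + 2.94\hat{p}) = 2.94\,\mathrm{Std}(\hat{p}) = 2.94\sqrt{\frac{p(1 - p)}{n}}.
\]

Now using the fact that $q = -1.94 + 2.94p$, we have $p = (q + 1.94)/2.94$. Hence,

\[
2.94\sqrt{\frac{p(1 - p)}{n}} = 2.94\sqrt{\frac{(q + 1.94)(2.94 - q - 1.94)}{(2.94)^2\, n}} = \sqrt{\frac{(1.94 + q)(1 - q)}{n}}.
\]

Hence the standard deviation of $\hat{q}$ is $\sqrt{(1.94 + q)(1 - q)/n}$.

(c) (5 points) Suppose $X = 80$ was observed with $n = 100$. What is your estimate for $q$? Give an approximate 99% confidence interval for $q$.

Solution: Here $\hat{p} = X/n = 80/100 = 0.8$. So the estimate for $q$ is

\[
\hat{q} = -1.94 + 2.94\hat{p} = -1.94 + 2.94 \times 0.8 = 0.412.
\]

The estimated standard deviation of $\hat{q}$ is

\[
\mathrm{SE}_{\hat{q}} = 2.94\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = 2.94\sqrt{\frac{0.8 \times 0.2}{100}} = 0.1176.
\]

Now $(\hat{q} - q)/\mathrm{SE}_{\hat{q}}$ approximately follows the $N(0, 1)$ distribution. Hence a 99% confidence interval for $q$ is $\hat{q} \pm z^* \mathrm{SE}_{\hat{q}}$, where $z^* = 2.576$ is the 99% normal cutoff point. So the 99% C.I. is

\[
0.412 \pm 2.576 \times 0.1176 = 0.412 \pm 0.303 = (0.109,\ 0.715).
\]

(d) (5 points)

Solution: The margin of error in a 99% confidence interval for $q$ is

\[
2.576\,\mathrm{SE}_{\hat{q}} = 2.576 \times 2.94\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}.
\]

Using the fact that $p \ge 0.66$, the margin of error is less than

\[
2.576 \times 2.94\sqrt{\frac{0.66 \times 0.34}{n}} = \sqrt{\frac{12.871}{n}}.
\]

Hence

\[
\sqrt{\frac{12.871}{n}} \le 0.05 \iff n \ge \frac{12.871}{(0.05)^2} = 5148.4,
\]

so $n \ge 5149$.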
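The estimate, interval, and sample-size bound in parts (c) and (d) follow mechanically from the linear map $\hat{q} = -1.94 + 2.94\hat{p}$. A minimal Python sketch of that arithmetic, with hypothetical names (`q_interval`, `n_min`) and the 0.05 margin target taken from the solution to part (d):

```python
import math

Z_99 = 2.576                 # 99% normal cutoff used in the solution
INTERCEPT, SLOPE = -1.94, 2.94   # q = INTERCEPT + SLOPE * p


def q_interval(x, n, z=Z_99):
    """Return (q_hat, standard error, confidence interval) for q."""
    p_hat = x / n
    q_hat = INTERCEPT + SLOPE * p_hat
    # A linear map scales the standard error of p_hat by |SLOPE|.
    se = SLOPE * math.sqrt(p_hat * (1 - p_hat) / n)
    return q_hat, se, (q_hat - z * se, q_hat + z * se)


q_hat, se_q, (lower, upper) = q_interval(80, 100)

# Part (d): since p >= 0.66 > 0.5, p * (1 - p) is largest at p = 0.66.
target_margin = 0.05
worst_case = (Z_99 * SLOPE) ** 2 * 0.66 * 0.34   # about 12.871
n_min = math.ceil(worst_case / target_margin ** 2)

print(round(q_hat, 3), round(se_q, 4), round(lower, 3), round(upper, 3), n_min)
```

Rounding up with `math.ceil` matters here: 5148 observations would leave the worst-case margin slightly above 0.05.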

4. (15 points) Plates for Glass

(a) (6 points)

Solution: An unbiased estimate for $\mu_X - \mu_Y$ is $\bar{X} - \bar{Y} = 469 - 463 = 6$. Here our assumptions are:

The X and Y samples are independent.
The population variances are equal.

So an unbiased estimate for the common variance of X and Y is the pooled variance

\[
s_p^2 = \frac{(5 - 1)\, s_x^2 + (5 - 1)\, s_y^2}{5 + 5 - 2} = \frac{4 \times 839.5 + 4 \times 916.5}{8} = 878,
\]

i.e. $s_p = 29.63$. So the estimated standard error of $\bar{X} - \bar{Y}$ is

\[
\mathrm{SE}_{\bar{X} - \bar{Y}} = s_p \sqrt{\frac{1}{5} + \frac{1}{5}} = 29.63\sqrt{0.4} = 18.740.
\]

Now

\[
\frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\mathrm{SE}_{\bar{X} - \bar{Y}}}
\]

follows a t-distribution with $5 + 5 - 2 = 8$ degrees of freedom. The 95% cutoff point for the t-distribution with 8 d.f. is $t^* = 2.307$. Hence a 95% confidence interval for $\mu_X - \mu_Y$ is

\[
(\bar{X} - \bar{Y}) \pm t^*\, \mathrm{SE}_{\bar{X} - \bar{Y}} = 6 \pm 2.307 \times 18.74 = (-37.233,\ 49.233).
\]

Since 0 is in the 95% confidence interval for $\mu_X - \mu_Y$, we fail to reject the null hypothesis that the averages for the two processes are the same at the 5% significance level.

(b) (6 points)

Solution: Here an unbiased estimate for $\mu_X - \mu_Y$ is again $\bar{X} - \bar{Y} = 469 - 463 = 6$. But the assumptions are:

The X and Y samples are paired. So the pairs $(X_i, Y_i)$ are independent.

So the estimated standard error of $\bar{X} - \bar{Y}$ is

\[
\mathrm{SE}_{\bar{X} - \bar{Y}} = \frac{s_{\text{difference}}}{\sqrt{5}} = \sqrt{21.5/5} = 2.074.
\]

Now

\[
\frac{(\bar{X} - \bar{Y}) - (\mu_X - \mu_Y)}{\mathrm{SE}_{\bar{X} - \bar{Y}}}
\]

follows a t-distribution with $5 - 1 = 4$ degrees of freedom. The 95% cutoff point for the t-distribution with 4 d.f. is $t^* = 2.777$. Hence a 95% confidence interval for $\mu_X - \mu_Y$ is

\[
(\bar{X} - \bar{Y}) \pm t^*\, \mathrm{SE}_{\bar{X} - \bar{Y}} = 6 \pm 2.777 \times 2.074 = (0.241,\ 11.759).
\]

Since 0 is not in the 95% confidence interval for $\mu_X - \mu_Y$, we reject the null hypothesis that the averages for the two processes are the same at the 5% significance level.
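The contrast between the pooled and paired analyses comes down to the standard error, which the following sketch recomputes from the summary statistics quoted in the solutions. The t cutoffs (2.307 and 2.777) are copied from the solutions rather than computed, and the helper name `t_interval` is illustrative.

```python
import math


def t_interval(estimate, standard_error, t_cutoff):
    """Two-sided interval: estimate +/- t_cutoff * standard_error."""
    half_width = t_cutoff * standard_error
    return estimate - half_width, estimate + half_width


n = 5                       # observations per sample
mean_diff = 469 - 463       # X-bar minus Y-bar

# (a) Independent samples with equal variances: pooled variance, 8 d.f.
var_x, var_y = 839.5, 916.5
var_pooled = ((n - 1) * var_x + (n - 1) * var_y) / (2 * n - 2)
se_pooled = math.sqrt(var_pooled * (1 / n + 1 / n))
ci_pooled = t_interval(mean_diff, se_pooled, 2.307)

# (b) Paired samples: variance of the within-pair differences, 4 d.f.
var_difference = 21.5
se_paired = math.sqrt(var_difference / n)
ci_paired = t_interval(mean_diff, se_paired, 2.777)

print([round(v, 2) for v in ci_pooled], [round(v, 2) for v in ci_paired])
```

The pooled interval contains 0 while the paired interval does not, even though the point estimate is 6 in both cases. Last-digit differences from the hand calculation arise because the solutions round the standard error before multiplying.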

(c) (3 points)

Solution: (Less variance due to positive correlation between the pairs. Removal of the effect of lurking variables.)

5. (20 points) Prediction of ozone level

(a) (3 points) Write down the multiple regression equation.

Solution: The estimated regression equation is

\[
\widehat{\text{OZONE}} = 388.4121 - 0.1957033\,\text{YEAR} + 0.0342877\,\text{RAIN}.
\]

(b) (7 points)

Solution: The missing value for the t-statistic for RAIN is

\[
\frac{b_{\text{RAIN}}}{\mathrm{se}_{b_{\text{RAIN}}}} = \frac{0.0342877}{0.0096548} = 3.5514.
\]

The error degrees of freedom are $n - p - 1 = 13 - 2 - 1 = 10$. So

\[
\frac{b_{\text{RAIN}} - \beta_{\text{RAIN}}}{\mathrm{se}_{b_{\text{RAIN}}}}
\]

follows a t-distribution with 10 degrees of freedom. The 95% cutoff point for the t-distribution with 10 d.f. is $t^* = 2.229$. Hence a 95% confidence interval for the regression parameter for rain ($\beta_{\text{RAIN}}$) is

\[
b_{\text{RAIN}} \pm t^*\, \mathrm{se}_{b_{\text{RAIN}}} = 0.0342877 \pm 2.229 \times 0.0096548 = (0.012767,\ 0.055808).
\]

(c) (4 points)

Solution: For the regression model the degrees of freedom are $p = 2$ and the regression mean square is MSR $= 10.3680841/2 = 5.18404205$. The error degrees of freedom are $12 - 2 = 10$ and the mean square error is MSE $= 1.03960755/10 = 0.103960755$.

(d) (6 points)

Solution: The value of the F-statistic used for this test is

\[
F = \frac{\text{MSR}}{\text{MSE}} = \frac{5.18404205}{0.103960755} = 49.8654.
\]

The degrees of freedom for the F-statistic are $(p,\ n - 1 - p) = (2, 10)$.

6. (10 points) For the items below, select True or False.

(a) (2 points) If the correlation between two random variables X and Y is negative, then $\mathrm{Var}(X + Y) < \mathrm{Var}(X - Y)$.

Solution: TRUE. Note that $\mathrm{Var}(X \pm Y) = \sigma_X^2 + \sigma_Y^2 \pm 2\rho\sigma_X\sigma_Y$. Hence if the correlation $\rho$ between X and Y is negative, we have $\mathrm{Var}(X + Y) - \mathrm{Var}(X - Y) = 4\rho\sigma_X\sigma_Y < 0$.
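The regression quantities in problem 5 are simple ratios of numbers from the output table, so they are easy to verify. In this sketch the coefficient, standard error, and sums of squares are the values quoted in the solutions; the t cutoff 2.229 is copied from the solution to part (b), and the variable names are illustrative.

```python
# Values quoted in the solutions to problem 5.
b_rain, se_rain = 0.0342877, 0.0096548
n_obs, n_predictors = 13, 2
ss_regression, ss_error = 10.3680841, 1.03960755

# (b) t-statistic and 95% interval for the RAIN coefficient.
t_rain = b_rain / se_rain
df_error = n_obs - n_predictors - 1
t_cutoff = 2.229            # 95% cutoff for 10 d.f., as used in the solution
ci_rain = (b_rain - t_cutoff * se_rain, b_rain + t_cutoff * se_rain)

# (c) Mean squares: each sum of squares divided by its degrees of freedom.
ms_regression = ss_regression / n_predictors
ms_error = ss_error / df_error

# (d) Overall F-statistic with (n_predictors, df_error) degrees of freedom.
f_stat = ms_regression / ms_error

print(round(t_rain, 4), [round(v, 6) for v in ci_rain], round(f_stat, 4))
```

Because the interval for the RAIN coefficient excludes 0, the t-test for that coefficient would reject at the 5% level, in line with the large t-statistic.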

(b) (2 points) For two random variables X and Y, if $E(X - Y) = E(X + Y)$, then $E(Y)$ must be equal to 0.

Solution: TRUE. $E(X - Y) = E(X + Y)$ implies $E(X) - E(Y) = E(X) + E(Y)$, so $E(Y) = 0$.

(c) (2 points) If we fail to reject a null hypothesis $H_0$ at the 0.05 significance level, then there is a 95% probability that $H_0$ is true.

Solution: FALSE. The probability of $H_0$ being true is 0 or 1.

(d) (2 points) For a specified sample size, the margin of error for a confidence interval for a population mean $\mu$ increases as the confidence level increases.

Solution: TRUE. Note that the margin of error is $z^* \sigma/\sqrt{n}$ for known variance and $t^* s/\sqrt{n}$ for unknown variance, and the cutoff $z^*$ or $t^*$ increases as the confidence level increases.

(e) (2 points) In order to calculate a P-value, you must know the distribution of the test statistic under the alternative hypothesis $H_a$.

Solution: FALSE. We must know the distribution of the test statistic under the null hypothesis $H_0$.
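The claim in 6(d) can be illustrated for the known-variance case by computing the normal cutoff at several confidence levels. This is a small sketch using `statistics.NormalDist` from the Python standard library; the values $\sigma = 1$ and $n = 16$ are arbitrary illustrative choices, not taken from the exam, and `margin_of_error` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def margin_of_error(confidence_level, sigma, n):
    """Margin of error z * sigma / sqrt(n) for a known-variance interval."""
    # Two-sided cutoff: the upper (1 + level) / 2 quantile of N(0, 1).
    z_cutoff = NormalDist().inv_cdf(0.5 + confidence_level / 2)
    return z_cutoff * sigma / math.sqrt(n)


levels = [0.90, 0.95, 0.99]
margins = [margin_of_error(level, sigma=1.0, n=16) for level in levels]

for level, margin in zip(levels, margins):
    print(f"{level:.0%} confidence: margin of error = {margin:.3f}")
```

With the sample size held fixed, each step up in confidence level widens the interval, because only the cutoff changes.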
