BIOS 4110: Introduction to Biostatistics. Breheny. Lab #9

The Central Limit Theorem is very important in statistics, and today's lab explores its application to both categorical and continuous data. We will review using it for hypothesis testing and for confidence interval construction. Recall these important formulas:

I. CATEGORICAL

$$Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$

$\sqrt{p_0(1 - p_0)/n}$ is the standard error of the sample proportion
$\hat{p}$ is the sample proportion
$p_0$ is the hypothesized value of the proportion
$n$ is the sample size

This formula is used for hypothesis testing. The standard error uses the hypothesized value of the true proportion in the population of interest. Recall that in the large-sample setting, the distribution of the sample proportion is approximately normal; therefore we can use the Z distribution to approximate the probability of observing a sample proportion as extreme as, or more extreme than, the one observed, given that the true proportion is equal to $p_0$.

[Figure: the exact probability mass function (bars) and the normal probability density function (line). NOTE: n = 6, p = 0.5]

Notice how the two are approximately the same.
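The comparison the figure makes can be sketched in R. This is a minimal illustration, assuming the values in the figure note (n = 6, p = 0.5) and comparing on the count scale rather than the proportion scale; the variable names are ours, not part of the lab.

```r
# Values assumed from the figure note: n = 6, p = 0.5
n <- 6
p <- 0.5

k <- 0:n                                      # possible numbers of successes
exact <- dbinom(k, size = n, prob = p)        # exact binomial probabilities (the bars)
approx <- dnorm(k, mean = n * p,
                sd = sqrt(n * p * (1 - p)))   # normal density at each k (the line)

# Side-by-side table of the exact and approximate values
comparison <- data.frame(k = k,
                         exact = round(exact, 4),
                         normal = round(approx, 4))
print(comparison)
```

Increasing n in the sketch (with p held fixed) should make the two columns agree more closely, which is the large-sample behavior the formula relies on.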

$(1 - \alpha) \times 100\%$ confidence interval for the true proportion, $p$:

$$\left( \hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}},\ \ \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \right)$$

$\hat{p}$ is the sample proportion
$Z_{\alpha/2}$ is the critical value (depends on the value of $\alpha$)
Only works in the large-sample setting: $\hat{p}$ can't be close to 0 or 1, and $n$ has to be relatively large
The interval is bounded by 0 and 1

If we are interested in providing a confidence interval for the true proportion in the population, we can use the formula above. We are $(1 - \alpha) \times 100\%$ confident that the true proportion of [INSERT] in [POPULATION] is between [LOWER, UPPER].

II. CONTINUOUS

$$T = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$

$\bar{x}$ is the sample mean
$s$ is the sample standard deviation
$n$ is the sample size
$\mu_0$ is the hypothesized mean
df is the degrees of freedom, which is equal to $n - 1$

NOTES: This statistic is based on Student's curve, which is very similar to the normal curve. However, its tails are thicker, and we have to account for the degrees of freedom. The thicker tails reflect the extra variability that comes from having to estimate the standard deviation. When $n$ is greater than about 50, the normal distribution and Student's curve are about the same. In practice, Student's curve is used much more often. Note the degrees of freedom for T.

$(1 - \alpha) \times 100\%$ confidence interval for the true mean, $\mu$:

$$\left( \bar{x} - T_{\alpha/2,\,df}\,\frac{s}{\sqrt{n}},\ \ \bar{x} + T_{\alpha/2,\,df}\,\frac{s}{\sqrt{n}} \right)$$

We are $(1 - \alpha) \times 100\%$ confident that the true mean [INSERT] in [POPULATION] is between [LOWER, UPPER].
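As a rough sketch, the interval for the mean can be computed in R from summary statistics. The helper `t_interval` and the numbers passed to it are hypothetical, introduced only for illustration; `qt` and `qnorm` are the standard R quantile functions. The last lines show how the t critical value approaches the normal critical value as the degrees of freedom grow.

```r
# Hypothetical helper (not part of the lab): t interval from summary statistics
t_interval <- function(xbar, s, n, conf_level = 0.95) {
  alpha <- 1 - conf_level
  crit <- qt(1 - alpha / 2, df = n - 1)   # critical value T_{alpha/2, df}
  se <- s / sqrt(n)                       # estimated standard error of the mean
  c(lower = xbar - crit * se, upper = xbar + crit * se)
}

# Made-up summary statistics, for illustration only
ci <- t_interval(xbar = 10, s = 2, n = 25)
print(ci)

# t critical values compared with the normal critical value as df grows
df <- c(5, 10, 30, 50, 100)
crit_table <- data.frame(df = df,
                         t_crit = round(qt(0.975, df), 3),
                         z_crit = round(qnorm(0.975), 3))
print(crit_table)
```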

Confidence Intervals Interpretation

If you were to repeat this process an infinite number of times, 95% of the interval estimates for μ created this way would contain the true parameter value μ.
We treat the population mean μ as being fixed.
Any particular interval may or may not contain the true population mean μ.
We say we are 95% confident the interval does contain μ because the procedure used to construct this interval produces a correct interval estimate 95% of the time.
We do not say there is a 95% probability that μ lies between these values.

III. PRACTICE PROBLEMS

1. A sample of Alzheimer's patients is tested to assess the amount of time spent in stage IV sleep. It has been hypothesized that individuals suffering from Alzheimer's Disease may spend less time per night in the deeper stages of sleep. The number of minutes spent in stage IV sleep is recorded for sixty-one patients. The sample produced a mean of 45 minutes (s = 14 minutes) of stage IV sleep over a 24-hour period.

a) Will we need to know the underlying distribution of stage IV sleep in the population of individuals suffering from Alzheimer's Disease to make inferences on the sample mean? Explain.

No, we do not. We have a sufficiently large sample size (n = 61), so by the Central Limit Theorem the sample mean will be approximately normal.

b) Suppose we want to test whether the true mean amount of time in stage IV sleep for this population is 49 minutes, which is the mean of the general population. Set up the hypothesis test, find the critical value, report the p-value and conclusion.

$H_0: \mu = 49$ versus $H_1: \mu \neq 49$.

$$t = \frac{45 - 49}{14/\sqrt{61}} = -2.232$$

P(T < -2.232)*2 = 2*pt(-2.232, df = 60) = 0.029 = p-value.

We have sufficient evidence that the mean is in fact different from that of the general population (p-value = 0.029).
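The calculation in part (b) can be checked in R from the summary statistics given in the problem. This is a minimal sketch; the variable names are ours, and the two-sided critical value at the 5% level is included because the problem asks for it.

```r
# Summary statistics stated in Problem 1
xbar <- 45    # sample mean (minutes)
s <- 14       # sample standard deviation (minutes)
n <- 61       # sample size
mu0 <- 49     # hypothesized mean (general population)

se <- s / sqrt(n)                              # estimated standard error
t_stat <- (xbar - mu0) / se                    # test statistic
p_value <- 2 * pt(-abs(t_stat), df = n - 1)    # two-sided p-value
crit <- qt(0.975, df = n - 1)                  # two-sided critical value, alpha = 0.05

print(round(c(t = t_stat, critical = crit, p = p_value), 3))
```

Since |t| exceeds the critical value, the sketch reaches the same decision as the p-value does.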

c) Compute a 95 percent confidence interval for this data. What does this information tell you about a particular individual's (an Alzheimer's patient) stage IV sleep?

The critical value for T at significance level 0.05 and df = 60 is 2.

$$45 \pm 2 \cdot \frac{14}{\sqrt{61}} = (41.41,\ 48.59)$$

The null-hypothesis value of 49 minutes is not in the 95% confidence interval. Therefore, we conclude that the average amount of time spent in stage IV sleep among Alzheimer's patients is less than the general population mean.

2. The distribution of weights for the population of males in the United States is approximately normal with mean µ = 172.2 and standard deviation σ = 29.8 pounds.

a) What is the probability that a randomly selected man weighs less than 130 pounds?

$$z = \frac{130 - 172.2}{29.8} = -1.416, \qquad P(Z < -1.416) = 0.078$$

b) What is the probability that he weighs more than 210 pounds?

$$z = \frac{210 - 172.2}{29.8} = 1.268, \qquad P(Z > 1.268) = 0.102$$

c) What is the probability that among five males selected at random from the population, at least one will have a weight outside the range 130 to 210 pounds?

The probability that a single man falls outside the range is 0.078 + 0.102 = 0.18, so

1 - dbinom(0, 5, 0.18) = 0.629

3. Suppose a new flu vaccine is developed for the state of Iowa; however, there are adverse side effects associated with it. We wish to test the hypothesis that the proportion of people who experience an adverse effect is the same as it is with the standard flu vaccine, which is 30%. Suppose we have a random sample of size 1000, and 270 people had an adverse effect to the new flu vaccine.

a) Set up the hypothesis test and calculate the sample proportion.

$H_0: p = 0.3$ versus $H_1: p \neq 0.3$

$\hat{p} = 270/1000 = 0.27$

b) Use binom.test(). Test the hypothesis stated in the problem at the 5% significance level and calculate a confidence interval. What do you conclude?

binom.test(270, 1000, 0.3): p-value = 0.038; the 95% confidence interval is (0.243, 0.299).

We have sufficient evidence to conclude that the true proportion of people who experience an adverse effect with this new flu vaccine is indeed lower than the true proportion of adverse effects from the standard flu vaccine (p = 0.038).

c) Use the normal approximation to find the p-value and confidence interval.

$$z = \frac{0.27 - 0.3}{\sqrt{0.3(1 - 0.3)/1000}} = -2.07$$

P(Z < -2.07)*2 = 2*pnorm(-2.07) = 0.038

$$95\%\ \text{C.I.} = 0.27 \pm 1.96\sqrt{\frac{0.27(1 - 0.27)}{1000}} = (0.242,\ 0.298)$$

We can see that it is very similar to the exact confidence interval, and we draw the same conclusion.
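The answers to Problems 2 and 3 can be reproduced with a short R sketch. The variable names are ours. One detail to be aware of: the sketch carries the unrounded probability into Problem 2(c), so its result can differ slightly in the third decimal from the handout's value, which uses the rounded 0.18.

```r
# Problem 2: weights approximately Normal(mean = 172.2, sd = 29.8)
mu <- 172.2
sigma <- 29.8
p_low <- pnorm(130, mean = mu, sd = sigma)                       # P(weight < 130)
p_high <- pnorm(210, mean = mu, sd = sigma, lower.tail = FALSE)  # P(weight > 210)
p_out <- p_low + p_high                                          # outside 130 to 210
p_any <- 1 - dbinom(0, size = 5, prob = p_out)                   # at least one of five

# Problem 3: 270 adverse effects among 1000 people, H0: p = 0.3
x <- 270
n <- 1000
p0 <- 0.3
p_hat <- x / n

exact <- binom.test(x, n, p = p0)     # exact binomial test and interval

z <- (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # test SE uses the hypothesized p0
p_norm <- 2 * pnorm(-abs(z))                  # two-sided normal-approximation p-value
ci_norm <- p_hat + c(-1, 1) * qnorm(0.975) *
  sqrt(p_hat * (1 - p_hat) / n)               # interval SE uses the sample proportion

print(round(c(p_low = p_low, p_high = p_high, p_any = p_any), 3))
print(round(c(z = z, p_norm = p_norm, p_exact = exact$p.value), 3))
print(round(rbind(normal = ci_norm, exact = exact$conf.int), 3))
```

Note that the test statistic and the confidence interval use different standard errors: the test plugs in the hypothesized value 0.3, while the interval plugs in the sample proportion 0.27.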