MAT 2379, Introduction to Biostatistics
Sample Calculator Problems for the Final Exam

Note: The exam will also contain some problems based on a printed Minitab output. Sample problems of this format are not provided. Students are encouraged to review the Minitab instructions on the web, and the solutions to the Minitab assignment problems.

1. A sample of 25 seven-year-old girls had a sample mean weight of 43.1 pounds and a sample standard deviation of 6.6 pounds. Assuming normality, find a 90% confidence interval for the average weight $\mu$ of a seven-year-old girl.
(A) [40.4; 45.8] (B) [41.3; 44.8] (C) [39.5; 46.1] (D) [40.8; 45.4] (E) [38.1; 47.8]

Solution: The interval is $\bar{x} \pm t\,(s/\sqrt{n})$. In this case, $t = 1.711$ since $P(T_{24} > 1.711) = 0.05$. We obtain the interval
$$43.1 \pm 1.711 \left( \frac{6.6}{\sqrt{25}} \right),$$
that is $43.1 \pm 2.258$, i.e. [40.8415; 45.3585]. The answer is D.

2. The concentration of active ingredients in a liquid laundry detergent is thought to be affected by the type of catalyst used in the manufacturing process. Ten observations on concentration are taken with each catalyst, yielding the following data:

Catalyst 1: 57.9, 66.2, 65.4, 65.4, 62.6, 67.6, 63.7, 67.2, 65.2, 71.0
Catalyst 2: 66.4, 71.7, 70.3, 69.3, 64.8, 69.6, 68.6, 69.4, 65.3, 68.8

Let $\mu_1$ and $\mu_2$ be the average concentrations of active ingredients when using catalyst 1 and catalyst 2, respectively. Give a point estimate for $\mu_2 - \mu_1$.
(A) 2.3 (B) 3.2 (C) 2.0 (D) 1.3 (E) -3.2

Solution: We have $\bar{x}_1 = 65.22$ and $\bar{x}_2 = 68.42$. An estimate for $\mu_2 - \mu_1$ is $\bar{x}_2 - \bar{x}_1 = 68.42 - 65.22 = 3.2$. The answer is B.

3. The height of a male student is a random variable with a normal distribution with mean $\mu = 174.5$ cm and standard deviation $\sigma = 6.75$ cm. What is the probability that a randomly chosen student has a height of more than 188.3 cm?
(A) 0.4772 (B) 0.0207 (C) 0.9772 (D) 0.4778 (E) 0.3451

Solution: Let $X$ be the height of the student.
By standardization,
$$P(X > 188.3) = P\left( \frac{X - 174.5}{6.75} > \frac{188.3 - 174.5}{6.75} \right) = P(Z > 2.04) = 1 - P(Z \le 2.04) = 1 - 0.9793 = 0.0207.$$
The answer is B.
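The standardization in Problem 3 can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name `normal_upper_tail` is an illustrative assumption, not part of the course material. It computes the tail probability both without rounding and with $z$ rounded to two decimals, as a table lookup would.

```python
from statistics import NormalDist


def normal_upper_tail(x, mean, sd):
    """Return P(X > x) for X ~ Normal(mean, sd)."""
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(x)


# Problem 3: heights with mean 174.5 cm and standard deviation 6.75 cm.
z = (188.3 - 174.5) / 6.75
p_exact = normal_upper_tail(188.3, 174.5, 6.75)

# A table lookup rounds z to two decimals first, as in the worked solution.
p_table = 1.0 - NormalDist().cdf(round(z, 2))

print(f"z = {z:.4f}")
print(f"P(X > 188.3), z not rounded: {p_exact:.4f}")
print(f"P(X > 188.3), z rounded to two decimals: {p_table:.4f}")
```

The two values differ slightly in the fourth decimal because the table method rounds $z$ before the lookup; both point to answer B.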
4. In a sample of 300 Canadians, 27 have type B blood. Based on this information, find a 95% confidence interval for the proportion $p$ of Canadians who have type B blood.
(A) [0.063; 0.117] (B) [0.085; 0.095] (C) [0.058; 0.122] (D) [0.071; 0.109] (E) [0.011; 0.139]

Solution: We have $\hat{p} = 27/300 = 0.09$. The interval is:
$$0.09 \pm 1.96 \sqrt{\frac{(0.09)(0.91)}{300}},$$
that is $0.09 \pm 0.032$, or [0.058; 0.122]. The answer is C.

5. Two companies manufacture a rubber material intended for use in automobiles. 25 samples of material produced by each company are tested in an abrasion test, and the amount of wear is observed. The sample means and sample standard deviations for company 1 and company 2 are summarized below:

Company 1: $\bar{x}_1 = 20$ mg, $s_1 = 2$ mg
Company 2: $\bar{x}_2 = 15$ mg, $s_2 = 8$ mg

Construct a 99% confidence interval for $\mu_1 - \mu_2$, where $\mu_1$ and $\mu_2$ are the average amounts of wear for the two companies. Assume that the two populations are normally distributed with unequal variances. Hint: Use the T distribution with $\nu = 26$ degrees of freedom.
(A) [0.42; 9.58] (B) [1.77; 8.23] (C) [2.03; 7.03] (D) [2.29; 7.71] (E) [3.14; 6.86]

Solution: The populations are independent. The number of degrees of freedom of T is:
$$\nu = \frac{(4/25 + 64/25)^2}{\left[ (4/25)^2 + (64/25)^2 \right] / 24} = 26.98$$
We use $\nu = 26$. From Table VI, we find $t = 2.779$, since $P(T_{26} < 2.779) = 0.995$. Note that $\bar{x}_1 - \bar{x}_2 = 5$ and the interval is:
$$5 \pm (2.779) \sqrt{\frac{4}{25} + \frac{64}{25}},$$
that is $5 \pm 4.58$, or [0.42; 9.58]. The answer is A.

6. A patient's condition in a hospital is classified as surgical or medical. In the United States, some patients have health insurance coverage under a national plan called Medicare, while others do not. The data obtained for 177 patients are given in the following table:

                Surgical   Medical
Medicare: yes   46         52
Medicare: no    36         43
Is there an association between the patient's condition and the patient's health insurance coverage? Give the range of the p-value of the test, and report your conclusion at level $\alpha = 0.05$.
(A) 0.1 < p-value < 0.2; there is no association
(B) 0.025 < p-value < 0.05; there is an association
(C) 0.05 < p-value < 0.10; there is no association
(D) 0.10 < p-value < 0.20; there is no association
(E) p-value > 0.20; there is no association

Solution: We have to test $H_0$, which says that there is no association between the two variables. We fill in the following table (expected counts in parentheses):

                Surgical     Medical      Total
Medicare: yes   46 (45.40)   52 (52.60)   98
Medicare: no    36 (36.60)   43 (42.40)   79
Total           82           95           177

The test statistic is:
$$\frac{(46 - 45.4)^2}{45.4} + \frac{(52 - 52.6)^2}{52.6} + \frac{(36 - 36.6)^2}{36.6} + \frac{(43 - 42.4)^2}{42.4} = 0.033$$
The p-value $P(Y > 0.033)$, where $Y$ has a chi-square distribution with 1 degree of freedom, is greater than 0.20. Since the p-value is larger than 0.05, we fail to reject $H_0$. We conclude that there is no association. The answer is E.

7. 85% of the adults living in a region are non-smokers, and 24% are non-smokers with emphysema (a chronic lung disease). What is the probability that a randomly selected person has emphysema, given that this person is a non-smoker?
(A) 0.15 (B) 0.182 (C) 0.625 (D) 0.282 (E) 0.931

Solution: Let $A$ be the event that the person is a non-smoker and $B$ be the event that the person has emphysema. We know that $P(A) = 0.85$ and $P(A \text{ and } B) = 0.24$. The desired probability is:
$$P(B \mid A) = \frac{P(A \text{ and } B)}{P(A)} = \frac{0.24}{0.85} = 0.282$$
The answer is D.

8. Ten adult males between the ages of 35 and 50 participated in a study to evaluate the effect of diet and exercise on lowering the blood cholesterol level. The cholesterol level was measured in each subject before and after participating in an aerobic exercise program. The data are given in the following table:
Subject (i)   Before ($x_i$)   After ($y_i$)
1             265              229
2             240              231
3             258              227
4             295              240
5             251              238
6             287              234
7             314              256
8             260              247
9             283              246
10            240              218

The differences $d_1 = x_1 - y_1$, $d_2 = x_2 - y_2$, ..., $d_{10} = x_{10} - y_{10}$ have a sample mean $\bar{d} = 32.7$ and sample standard deviation $s_d = 18.36$. Construct a 95% confidence interval for the average difference $\mu_d = \mu_X - \mu_Y$, where $\mu_X$ and $\mu_Y$ are the average cholesterol levels before and after the program, respectively. Does the data support the claim that the program has been effective in reducing the cholesterol level?
(A) [-13.56; 14.57]; not enough evidence
(B) [13.13; 26.26]; yes
(C) [-7.37; -1.28]; no
(D) [7.38; 14.54]; yes
(E) [19.57; 45.83]; yes

Solution: The two populations are dependent. From Table 4, we find $t = 2.262$, since $P(T_9 < 2.262) = 0.975$. The interval is:
$$32.7 \pm 2.262 \left( \frac{18.36}{\sqrt{10}} \right),$$
that is $32.7 \pm 13.13$, or [19.57; 45.83]. Since the interval contains only positive values, we conclude that $\mu_d > 0$, i.e. $\mu_X > \mu_Y$. The program has been effective. The answer is E.

9. A medical test is conducted to determine whether someone has a disease. Suppose that 8% of the population has this disease. The test has a 95% chance of detecting the disease, if the person has the disease, and a 90% chance of indicating that the disease is absent, if the person does not have the disease. What is the probability that a person has the disease, given that this person has tested positive?
(A) 0.52 (B) 0.32 (C) 0.45 (D) 0.90 (E) 0.12

Solution: Let $D$ be the event that the person has the disease, $T^+$ be the event that the test is positive, and $T^-$ the event that the test is negative. We know that $P(D) = 0.08$, $P(T^+ \mid D) = 0.95$ and $P(T^- \mid D^c) = 0.90$. From here, we infer that $P(T^+ \mid D^c) = 0.10$.
Using the addition rule and the multiplicative rule, we get:
$$P(T^+) = P(T^+ \text{ and } D) + P(T^+ \text{ and } D^c) = P(D) P(T^+ \mid D) + P(D^c) P(T^+ \mid D^c) = (0.08)(0.95) + (0.92)(0.10) = 0.076 + 0.092 = 0.168$$
The desired probability is:
$$P(D \mid T^+) = \frac{P(T^+ \text{ and } D)}{P(T^+)} = \frac{0.076}{0.168} = 0.45$$
The answer is C.

10. Let $X$ be the number of visits to the dentist's office of a randomly chosen student. Suppose that $X$ has a probability distribution function given by the following table:

x          0      1     2
P(X = x)   0.15   0.5   0.35

Find $P(X \le \mu)$, where $\mu = E(X)$ is the average number of visits.
(A) 0 (B) 0.15 (C) 1 (D) 0.65 (E) 0.15

Solution: We have $\mu = E(X) = 0(0.15) + 1(0.5) + 2(0.35) = 1.2$. The desired probability is:
$$P(X \le 1.2) = P(X \le 1) = P(X = 0) + P(X = 1) = 0.15 + 0.5 = 0.65$$
The answer is D.

11. A meteorologist measures the temperature (in degrees Celsius) in downtown Ottawa at noon on 5 consecutive days, and obtains the following data:

-15, -10, -9, +1, +3

Based on this data, give an estimate for: (i) the average temperature $\mu$ at noon; and (ii) the variance $\sigma^2$ of the temperature at noon.
(A) (i) -6.0; (ii) 59.0 (B) (i) 6.0; (ii) 54.4 (C) (i) 7.8; (ii) 76.2 (D) (i) 2.1; (ii) 54.4 (E) (i) 5.0; (ii) 36.4

Solution: (i) An estimate for the average temperature $\mu$ is
$$\bar{x} = \frac{(-15) + (-10) + (-9) + (+1) + (+3)}{5} = -6.$$
(ii) An estimate for $\sigma^2$ is:
$$s^2 = \frac{1}{4} \left\{ [-15 - (-6)]^2 + [-10 - (-6)]^2 + [-9 - (-6)]^2 + [+1 - (-6)]^2 + [+3 - (-6)]^2 \right\} = 59.0$$
The answer is A.

12. In a recent poll of 1,215 Canadians, 643 are in favor of Canada withdrawing its troops from Afghanistan. Give a point estimate of the proportion $p$ of Canadians who are not in favor of Canada withdrawing its troops from Afghanistan.
(A) 0.53 (B) 0.42 (C) 0.47 (D) 0.55 (E) 0.50

Solution: The estimate for $p$ is $\hat{p} = (1215 - 643)/1215 = 572/1215 = 0.47$. The answer is C.
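The total-probability and Bayes' rule steps used in Problem 9 can be written as a small function. This is a sketch only; the function name `posterior_positive` and its parameter names (prevalence, sensitivity, specificity) are labels chosen here for illustration and do not appear in the problem set.

```python
def posterior_positive(prevalence, sensitivity, specificity):
    """Return (P(T+), P(D | T+)) using total probability and Bayes' rule.

    prevalence  = P(D)
    sensitivity = P(T+ | D)
    specificity = P(T- | D^c), so P(T+ | D^c) = 1 - specificity
    """
    p_pos_and_d = prevalence * sensitivity
    p_pos_and_not_d = (1.0 - prevalence) * (1.0 - specificity)
    p_pos = p_pos_and_d + p_pos_and_not_d
    return p_pos, p_pos_and_d / p_pos


# Problem 9: 8% prevalence, 95% detection rate, 90% correct negatives.
p_pos, p_d_given_pos = posterior_positive(0.08, 0.95, 0.90)
print(f"P(T+) = {p_pos:.3f}")
print(f"P(D | T+) = {p_d_given_pos:.2f}")
```

The same function applies to any two-group setting of this kind, provided the three inputs are interpreted as in the docstring.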
13. 10% of the residents of a certain city test positive for a certain antibody. What is the probability that in a randomly selected group of 5 residents, at least one will test positive?
(A) 0.4095 (B) 0.5905 (C) 0.5 (D) 0.9040 (E) 0.7651

Solution: Let $X$ be the number of people who will test positive. $X$ has a binomial distribution with $n = 5$ and $p = 0.1$. The desired probability is:
$$P(X \ge 1) = 1 - P(X = 0) = 1 - (0.9)^5 = 1 - 0.5905 = 0.4095$$
The answer is A.

14. A population includes 40% females and 60% males. For the women, the probability of having a hand size smaller than 100 cm² is 0.31. For the men, this probability is 0.08. What is the probability that a randomly chosen person will have a hand size smaller than 100 cm²?
(A) 0.352 (B) 0.452 (C) 0.172 (D) 0.202 (E) 0.912

Solution: Let $H$ be the event that the person has a hand smaller than 100 cm², $F$ the event that the person is female, and $M$ the event that the person is male. We know that $P(H \mid F) = 0.31$ and $P(H \mid M) = 0.08$. Using the rule of total probability,
$$P(H) = P(H \mid F) P(F) + P(H \mid M) P(M) = (0.31)(0.4) + (0.08)(0.6) = 0.172$$
The answer is C.

15. Suppose $A$ and $B$ are events such that $P[A] = 0.2$, $P[B] = 0.6$ and $P[A \text{ or } B] = 0.68$. Are $A$ and $B$ independent events?
(A) Yes (B) No (C) Insufficient information given

Solution: Note that
$$P[A \text{ and } B] = P[A] + P[B] - P[A \text{ or } B] = 0.2 + 0.6 - 0.68 = 0.12$$
$$P[A] P[B] = (0.2)(0.6) = 0.12$$
Since $P[A \text{ and } B] = P[A] P[B]$, we conclude that the events $A$ and $B$ are independent. The answer is A.

16. Suppose that $A$ and $B$ are events such that $P[A] = 0.6$, $P[B] = 0.4$ and $P[A \text{ and } B] = 0.3$. Find $P[B \mid A^c]$, the probability that $B$ occurs given that $A$ does not occur.
(A) 0.5 (B) 0.1 (C) 0.4 (D) 0.25 (E) insufficient information given
Solution: We have $P[A^c] = 1 - P[A] = 1 - 0.6 = 0.4$ and $P[B \text{ and } A^c] = P[B] - P[A \text{ and } B] = 0.4 - 0.3 = 0.1$. Hence
$$P[B \mid A^c] = \frac{P[B \text{ and } A^c]}{P[A^c]} = \frac{0.1}{0.4} = 0.25$$
The answer is D.

17. There are 500 students writing this examination. Experience tells us that the marks on the exam are normally distributed with mean 70 and standard deviation 10. What is the expected number of students that will have a mark of 80 or more on this exam?
(A) 70 (B) 79.35 (C) 15.87 (D) 84.13 (E) 420.65

Solution: Let $X$ be the grade of a randomly chosen student. The probability that this student's grade is at least 80 is
$$P[X \ge 80] = P\left[ \frac{X - 70}{10} \ge \frac{80 - 70}{10} \right] = P[Z \ge 1] = 1 - 0.8413 = 0.1587$$
Let $Y$ be the number of students (among the 500) who will have a grade of at least 80. Then $Y \sim \text{Binomial}(500, 0.1587)$ and $E[Y] = (500)(0.1587) = 79.35$. The answer is B.

18. Let $X_1, \ldots, X_7$ be a random sample from a normal distribution with mean 5. Let $\bar{X}$ and $S$ be the sample mean and sample standard deviation, respectively. Find a constant $c$ such that
$$P\left( \frac{\bar{X} - 5}{S/\sqrt{7}} < c \right) = 0.01$$
(A) 2.576 (B) -2.576 (C) -2.998 (D) 2.998 (E) -3.143

Solution: The variable $Y = (\bar{X} - 5)/[S/\sqrt{7}]$ has a t distribution with 6 d.f. From Table 4, we find $P[Y \le -3.143] = 0.01$ and hence $c = -3.143$. The answer is E.

19. The blood pressure of a male between 50 and 60 is known to be normally distributed and to have a standard deviation of 10 mm Hg. The blood pressure of 36 men (all between 50 and 60) is measured. The average pressure in the sample was observed to be 118 mm Hg, and the sample standard deviation was 12 mm Hg. Find a 95% confidence interval for the mean blood pressure $\mu$ of men aged 50-60.
(A) 118 ± 2.74 (B) 118 ± 3.27 (C) 118 ± 4.06 (D) 118 ± 3.38 (E) 118 ± 0.54

Solution: Because $\sigma = 10$ is known, we use the confidence interval based on $z$. From Table 3 we read $z = 1.96$. The confidence interval is
$$118 \pm (1.96) \left( \frac{10}{6} \right),$$
or equivalently $118 \pm 3.27$. The answer is B.
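The expected count in Problem 17 combines a normal tail probability with the mean of a binomial variable, $E[Y] = np$. A minimal sketch with Python's standard library follows; the variable names are illustrative assumptions. Because it does not round the probability to four decimals, the result differs slightly from the table-based 79.35.

```python
from statistics import NormalDist

# Problem 17: marks are Normal(70, 10) and 500 students write the exam.
marks = NormalDist(mu=70, sigma=10)
n_students = 500

# P(X >= 80); for a continuous variable this equals P(X > 80).
p_at_least_80 = 1.0 - marks.cdf(80)

# Mean of Binomial(n, p) is n * p.
expected_count = n_students * p_at_least_80

print(f"P(X >= 80) = {p_at_least_80:.4f}")
print(f"Expected number of students with 80 or more: {expected_count:.2f}")
```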
20. It is known that monozygotic twins occur with probability 2/9 and dizygotic twins occur with probability 7/9. In the monozygotic case the probability that both twins are male is 1/2, whereas in the dizygotic case this probability is 1/4. What is the probability that in a randomly selected pair of twins, both are males?
(A) 1/4 (B) 7/36 (C) 5/18 (D) 11/36 (E) 1/2

Solution: Let $A$ be the event that the twins are monozygotic, and $B$ the event that both twins are male. We know that $P[A] = 2/9$, $P[A^c] = 7/9$, $P[B \mid A] = 1/2$ and $P[B \mid A^c] = 1/4$. Hence
$$P[B] = P[A]\, P[B \mid A] + P[A^c]\, P[B \mid A^c] = \frac{2}{9} \cdot \frac{1}{2} + \frac{7}{9} \cdot \frac{1}{4} = \frac{11}{36}$$
The answer is D.

21. Continuing with the situation described in Problem 20, if a couple has a pair of male twins, what is the probability that these twins are monozygotic?
(A) 4/11 (B) 4/9 (C) 15/36 (D) 7/18 (E) 2/9

Solution: Using the answer calculated in Problem 20, we get
$$P[A \mid B] = \frac{P[A]\, P[B \mid A]}{P[B]} = \frac{(2/9)(1/2)}{11/36} = \frac{4}{11}$$
The answer is A.

22. The following table gives the cumulative distribution function of a discrete random variable $X$. Find the expected value of $X$.

x      5     6     7     10
F(x)   0.4   0.6   0.9   1.00

(A) 7.5 (B) 6.3 (C) 21.9 (D) 5.8 (E) 8.1

Solution: We have the following table of the density function of $X$:

x      5     6     7     10
f(x)   0.4   0.2   0.3   0.1

The expected value of $X$ is
$$E[X] = (5)(0.4) + (6)(0.2) + (7)(0.3) + (10)(0.1) = 2 + 1.2 + 2.1 + 1 = 6.3$$
The answer is B.

23. The following graph represents the density function of a continuous random variable $X$. Calculate the height $h$ on the graph.
(a) h = 1/3 (b) h = 1/4 (c) h = 2/3 (d) h = 7/8 (e) h = 1/2

[Figure: density of X, a triangle of height h over the interval from 0 to 3.]
Solution: The total area under the graph is equal to 1. We have
$$1 = \text{Area(triangle)} = \frac{h \cdot 3}{2}$$
and hence $h = 2/3$. The answer is (c).

24. A study is conducted to compare the effect of green light, in comparison to red light, on the growth of bean plants. The following table shows data on the heights of plants (in inches) after two weeks. Test the hypothesis $H_0: \mu_1 = \mu_2$ versus $H_A: \mu_1 > \mu_2$. What are the observed value of the test statistic and the p-value?

Green Light         Red Light
$n_1 = 9$           $n_2 = 5$
$\bar{x}_1 = 9.8$   $\bar{x}_2 = 7.3$
$s_1^2 = 5.4$       $s_2^2 = 2.5$

(A) The test statistic is 4.52; the p-value is smaller than 0.0005
(B) The test statistic is 1.43; the p-value is in the interval [0.05; 0.10]
(C) The test statistic is 2.38; the p-value is in the interval [0.01; 0.02]
(D) The test statistic is 1.27; the p-value is larger than 0.10
(E) none of the preceding

Solution: With $s_1^2/n_1 = 5.4/9 = 0.6$ and $s_2^2/n_2 = 2.5/5 = 0.5$, the number of degrees of freedom is
$$\nu = \frac{(0.6 + 0.5)^2}{(0.36)/8 + (0.25)/4} = \frac{1.21}{0.1075} = 11.25;$$
we take $\nu = 11$. The observed value of the test statistic is
$$\frac{9.8 - 7.3}{\sqrt{0.6 + 0.5}} = \frac{2.5}{\sqrt{1.1}} = 2.38.$$
The p-value lies in the interval [0.01; 0.02]. The answer is C.

25. The maximum temperature (in degrees Fahrenheit) on December 1st in Boston has a normal distribution with mean $\mu = 40$ and variance $\sigma^2 = 100$. We know that $x$ degrees Fahrenheit equals $5(x - 32)/9$ degrees Celsius. What is the probability of having a maximum temperature of at most 0 degrees Celsius on December 1st in Boston?
(A) 0.2119 (B) 0.5000 (C) 0.5320 (D) 0.7881 (E) less than 0.0002

Solution:
$$P\left( \frac{5(X - 32)}{9} \le 0 \right) = P(X \le 32) = P\left( \frac{X - 40}{10} \le \frac{32 - 40}{10} \right) = P(Z \le -0.8) = 0.2119.$$
The answer is A.

26. In the USA, 30% of adults have been tested for HIV. In the emergency room of a small American hospital, it is standard procedure to test all patients for HIV if they have not already been tested. If 15 patients are seen in the emergency room today, what is the probability that at least 10 will need to be tested?
(A) 0.0007 (B) 0.2784 (C) 0.5155 (D) 0.7216 (E) 0.0039

Solution: The probability that a person has not already been tested is 0.7. Let $X$ be the number of persons (among the 15) who have not already been tested for HIV. Then $X \sim \text{Binomial}(15, 0.7)$. From Table I, the desired probability is $P[X \ge 10] = 1 - P[X \le 9] = 1 - 0.2784 = 0.7216$. The answer is D.

27. In the double cohort year, the students in first year calculus had different mathematics courses in their final year of high school. It was of interest to see if there was an association between their high school math courses (A and B) and the likelihood of passing first year calculus. A sample of 93 first year students yielded the following data:

                  Course A   Course B   Total
Passed Calculus   34         16         50
Failed Calculus   19         24         43
Total             53         40         93

Test the hypothesis $H_0$ that there was no association between high school math and first year calculus against the alternative $H_A$ that there was an association. Give the range of the p-value of the test, and state your conclusion at the level $\alpha = 0.05$.
(A) 0.005 < p-value < 0.01, reject $H_0$, there is some association
(B) 0.01 < p-value < 0.02, reject $H_0$, there is some association
(C) 0.02 < p-value < 0.05, reject $H_0$, there is some association
(D) 0.05 < p-value < 0.1, fail to reject $H_0$, no association
(E) 0.2 < p-value, fail to reject $H_0$, no association

Solution: We fill in the contingency table as follows (expected counts in parentheses):

                  Course A    Course B    Total
Passed Calculus   34 (28.5)   16 (21.5)   50
Failed Calculus   19 (24.5)   24 (18.5)   43
Total             53          40          93

The observed value of the test statistic is:
$$\chi^2 = \frac{(34 - 28.5)^2}{28.5} + \frac{(16 - 21.5)^2}{21.5} + \frac{(19 - 24.5)^2}{24.5} + \frac{(24 - 18.5)^2}{18.5} = \frac{30.25}{28.5} + \frac{30.25}{21.5} + \frac{30.25}{24.5} + \frac{30.25}{18.5} = 5.34$$
The p-value $P(Y > 5.34)$, where $Y$ has a chi-square distribution with 1 degree of freedom, lies in the interval [0.02; 0.05]. We reject $H_0$ at level 0.05. We decide that there is some association between the two variables. The answer is C.
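The chi-square computations in Problems 6 and 27 follow the same recipe: expected count = (row total)(column total)/(grand total), then sum (observed - expected)²/expected. The sketch below is a standard-library illustration, not the course's prescribed method; the function name `chi_square_2x2` is hypothetical. It relies on the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal, so $P(Y > x) = \operatorname{erfc}(\sqrt{x/2})$; this shortcut is valid only for 2x2 tables.

```python
import math


def chi_square_2x2(table):
    """Return (expected counts, chi-square statistic, p-value) for a 2x2 table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    # Expected count under independence: row total * column total / total.
    expected = [[r * c / total for c in col_totals] for r in row_totals]
    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )
    # With 1 degree of freedom, P(Y > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return expected, statistic, p_value


# Problem 27: high school course versus calculus result.
_, stat27, p27 = chi_square_2x2([[34, 16], [19, 24]])
# Problem 6: Medicare coverage versus patient condition.
_, stat6, p6 = chi_square_2x2([[46, 52], [36, 43]])

print(f"Problem 27: chi-square = {stat27:.3f}, p-value = {p27:.4f}")
print(f"Problem 6:  chi-square = {stat6:.3f}, p-value = {p6:.4f}")
```

For Problem 27 the unrounded statistic is slightly larger than the 5.34 obtained from expected counts rounded to one decimal; the p-value stays between 0.02 and 0.05 either way.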
28. The effectiveness of a new method of teaching semi-literate adults to read is being assessed. A group of 15 such adults is randomly selected and a reading test is administered to evaluate their reading levels. The resulting sample mean is $\bar{x}_1 = 6.7$ with sample standard deviation $s_1 = 1.8$. The 15 people then undergo the new teaching method and are re-tested, with $\bar{x}_2 = 7.2$ and $s_2 = 2.1$. We want to test $H_0: \mu_1 - \mu_2 \ge 0$ against $H_1: \mu_1 - \mu_2 < 0$, at level $\alpha = 0.05$. Which of the following statements is correct?
(A) t = -0.700, reject $H_0$
(B) t = -0.700, do not reject $H_0$
(C) There is not enough information provided to carry out the appropriate test procedure.

Solution: This is a paired comparison test. In order to perform the test we need the value of $s_d$ (the standard deviation of the differences) and this value is not given. The correct answer is C.

29. Air Canada is trying to determine the proportion $p$ of no-shows (i.e. the proportion of passengers who book flights but who do not show up). In a three day period, 850 out of 10,000 booked passengers did not show up for their flights. Find a 95% confidence interval for $p$.
(A) [0.0795, 0.0905] (B) [0.0804, 0.0896] (C) [0.9104, 0.9196] (D) [0.9095, 0.9205] (E) none of the preceding

Solution: $\hat{p} = 850/10000 = 0.085$. The interval is
$$0.085 \pm 1.96 \sqrt{\frac{(0.085)(0.915)}{10000}},$$
that is $0.085 \pm 0.0055$, or [0.0795; 0.0905]. The answer is A.

30. The coach of the University of Ottawa weight-lifting team wants the 10 men to increase their mean body weight, and so he puts them on a protein diet for two months and then records their weights. If it can be assumed that the weights are still normally distributed with standard deviation 10, find a 95% confidence interval for the mean weight of a weight-lifter on the protein diet if the weights yield the following data: $\bar{x} = 105$, $s = 11$.
(A) 105 ± 1.96 (B) 105 ± 0.72 (C) 105 ± 2.26 (D) 105 ± 0.52 (E) 105 ± 6.20

Solution: Since $\sigma = 10$ is known, we use the formula $\bar{x} \pm z(\sigma/\sqrt{n})$. The interval is:
$$105 \pm 1.96 \left( \frac{10}{\sqrt{10}} \right),$$
that is $105 \pm 6.20$.
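The known-$\sigma$ interval $\bar{x} \pm z(\sigma/\sqrt{n})$ used in Problems 19 and 30 can be sketched as a short function. The name `z_interval` and its signature are assumptions made for this illustration; the critical value is obtained from the standard library's inverse normal CDF instead of a printed table, so it is 1.95996... and not exactly 1.96.

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper, half_width) of a z interval with known sigma."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    half_width = z * sigma / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width, half_width


# Problem 30: n = 10 weight-lifters, sample mean 105, sigma = 10.
low30, high30, hw30 = z_interval(105, 10, 10)
# Problem 19: n = 36 men, sample mean 118, sigma = 10.
low19, high19, hw19 = z_interval(118, 10, 36)

print(f"Problem 30: 105 ± {hw30:.2f}")
print(f"Problem 19: 118 ± {hw19:.2f}")
```

In both problems the sample standard deviation ($s = 11$ or $s = 12$) is a distractor: it is not used because $\sigma$ is given.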
31. Triglycerides are blood constituents that are thought to play a role in coronary artery disease. To see whether regular exercise could reduce triglyceride levels, researchers measured the concentration of triglycerides in the blood serum of 5 male volunteers, before and after participation in a ten-week exercise program. The results are shown in the table below:

Subject   Before ($x_1$)   After ($x_2$)
1         2.1              1.9
2         1.8              1.7
3         2.4              2.5
4         1.6              1.3
5         2.0              1.5

We want to test $H_0: \mu_1 = \mu_2$ against $H_1: \mu_1 > \mu_2$, at level $\alpha = 5\%$. Which of the following statements is correct?
(A) t = 2.00, reject $H_0$
(B) t = 2.00, do not reject $H_0$
(C) t = 0.49, reject $H_0$
(D) t = 0.49, do not reject $H_0$
(E) none of the preceding

Solution: In this case, the populations are dependent. We calculate the column of differences $d_1, d_2, d_3, d_4, d_5$. For these differences, we obtain $\bar{d} = 0.2$ and $s_d = 0.2236$. The test statistic is:
$$\frac{\bar{d} - 0}{s_d/\sqrt{n}} = \frac{0.2 - 0}{0.2236/\sqrt{5}} = 2.00$$
The p-value $P(T_4 > 2.00)$ lies between 0.05 and 0.10. We do not reject $H_0$. The answer is B.

32. If the serum cholesterol levels for a certain population are approximately normally distributed with a mean of 200 mg/100 ml and a standard deviation of 20 mg/100 ml, find the probability that a randomly chosen individual from this population will have a cholesterol level greater than 225 mg/100 ml.
(A) 0.1056 (B) 0.0885 (C) 0.2573 (D) 0.1374 (E) less than 0.0029

Solution: Let $X$ be the serum cholesterol level. By standardization,
$$P(X > 225) = P\left( \frac{X - 200}{20} > \frac{225 - 200}{20} \right) = P(Z > 1.25) = 1 - P(Z \le 1.25) = 1 - 0.8944 = 0.1056$$
The answer is A.
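The paired t statistic of Problem 31 can be reproduced directly from the before/after table. This is a minimal standard-library sketch; the variable names are illustrative, and the critical value 2.132 for a one-sided 5% test with 4 degrees of freedom is entered by hand as an assumption taken from a standard t table, since the standard library has no t distribution.

```python
import math
from statistics import mean, stdev

# Problem 31: triglyceride levels before and after the exercise program.
before = [2.1, 1.8, 2.4, 1.6, 2.0]
after = [1.9, 1.7, 2.5, 1.3, 1.5]

# Paired design: work with the within-subject differences.
differences = [b - a for b, a in zip(before, after)]
d_bar = mean(differences)
s_d = stdev(differences)  # sample standard deviation (divides by n - 1)
t_stat = (d_bar - 0.0) / (s_d / math.sqrt(len(differences)))

# Assumed table value: P(T_4 > 2.132) = 0.05.
T_CRITICAL_4DF_5PCT = 2.132
reject_h0 = t_stat > T_CRITICAL_4DF_5PCT

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}, t = {t_stat:.2f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```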