DAWSON COLLEGE
DEPARTMENT OF MATHEMATICS
201-BZS-05 PROBABILITY AND STATISTICS
FALL 2015 FINAL EXAM

Name:                      Date: December 24th, 2015
Student Number:            Time: 9:30 - 12:30
Grade:     / 116           Examiner: Matthew MARCHANT

Instructions:
1. No books or notes are permitted.
2. Only calculators without text storage and graphical capability are permitted.
3. Please show all your work clearly.
4. Please justify all your answers.
5. Cheating will result in a minimum penalty of zero in your exam grade.
6. Unless otherwise stated, round your answer to 2 decimal places.
1. [10 marks] The commuting distance was determined for each of 10 employees at Acme Manufacturing. One of the employees lives in another town and has a large commuting distance. The 10 distances were as follows:

5, 10, 7, 15, 10, 12, 8, 120, 20, 18

a. Sketch the dot plot (use employee number as the x-axis).
b. Find the mean. By how much does the outlier affect the mean?
c. Which statistic do you expect to be more representative of the population variability, the standard deviation or the interquartile range? Why?

Answers:
a. [Dot plot: distance from work (y-axis, 0 to 140) against employee number (x-axis, 0 to 12)]
b. mean = 225/10 = 22.5, mean without outlier = 105/9 = 11.67. The outlier raises the mean by about 10.83.
c. The interquartile range, because it is a robust statistic that is less sensitive to outliers, and we have an outlier, which is the value 120.

2. [5 marks] A quality-control technician selects 120 assembled parts from an assembly line and records the following information concerning these parts:
A: defective or non-defective
B: the employee number of the individual who assembled the part
C: the weight of the part

a. What is the population?
b. What is the sample?
c. Give the types of the three variables (i.e. Quantitative/Continuous).

Answers:
a. All of the parts that have ever been produced in the factory.
b. The 120 parts selected from the assembly line.
c. A: qualitative ordinal
   B: qualitative nominal
   C: quantitative continuous

3. [15 marks] A pair of fair dice is rolled once.
Let E = the event of a sum of 8
Let F = the event of a product of 15
Let G = the event of doubles
Let H = the event where a 4 and a 1 are obtained

a. Draw the Venn diagram for the 4 events.
b. Find: P(E), P(F), P(G) and P(H)
c. Find: P(E ∩ F), P(E ∩ G), P(E ∩ H), P(E | G)
d. Are E and H independent? Why?
e. Are E and G independent? Why?
f. Are E and G mutually exclusive? Why?
g. Are H and E mutually exclusive? Why?

Answers:
a. Sample space of the 36 outcomes (Venn diagram not reproduced):

1,1  1,2  1,3  1,4  1,5  1,6
2,1  2,2  2,3  2,4  2,5  2,6
3,1  3,2  3,3  3,4  3,5  3,6
4,1  4,2  4,3  4,4  4,5  4,6
5,1  5,2  5,3  5,4  5,5  5,6
6,1  6,2  6,3  6,4  6,5  6,6

b. P(E) = 5/36
   P(F) = 2/36 = 1/18
   P(G) = 6/36 = 1/6
   P(H) = 2/36 = 1/18
c. P(E and F) = 2/36 = 1/18
   P(E and G) = 1/36
   P(E and H) = 0
   P(E given G) = 1/6
d. No. E and H are mutually exclusive and both have nonzero probability, so P(E and H) = 0 is not equal to P(E)P(H); mutually exclusive events of this kind cannot be independent.
e. No, P(E given G) = 1/6 is not equal to P(E) = 5/36.
f. No, P(E and G) = 1/36 is not equal to 0.
g. Yes, P(E and H) is equal to 0.
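The probabilities in question 3 can be checked by enumerating the 36 outcomes. The following is a minimal Python sketch; the helper names `prob` and `both` are illustrative and not part of the exam.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling a pair of fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome (a, b)."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))

def both(first, second):
    """Intersection of two events."""
    return lambda o: first(o) and second(o)

E = lambda o: o[0] + o[1] == 8       # sum of 8
F = lambda o: o[0] * o[1] == 15      # product of 15
G = lambda o: o[0] == o[1]           # doubles
H = lambda o: sorted(o) == [1, 4]    # a 4 and a 1

p_E, p_F, p_G, p_H = prob(E), prob(F), prob(G), prob(H)
p_EF, p_EG, p_EH = prob(both(E, F)), prob(both(E, G)), prob(both(E, H))
p_E_given_G = p_EG / p_G             # conditional probability P(E | G)

# Independence holds only if P(A and B) equals P(A) * P(B).
E_G_independent = p_EG == p_E * p_G
E_H_independent = p_EH == p_E * p_H

print(p_E, p_F, p_G, p_H)
print(p_EF, p_EG, p_EH, p_E_given_G)
print(E_G_independent, E_H_independent)
```

Exact fractions are used instead of floats so that the independence comparisons are not affected by rounding.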
4. [10 marks] A company has 10 identical machines that produce nails independently. The probability that a machine will break down is 0.1. Define a random variable X to be the number of machines that will break down in a day.

a. What is the appropriate probability distribution for X?
b. Give the expression for the probability that r machines will break down.
c. Compute the probability that at least 1 machine will break down.
d. What is the expected number of machines that will break down?
e. What is the variance of the number of machines that will break down?

Answers:
a. Binomial, with n = 10 and p = 0.1
b. P(X = r) = C(10, r) p^r q^(10-r), where p = 0.1 and q = 0.9
c. P(X ≥ 1) = 1 - P(X < 1) = 1 - P(X = 0) = 1 - 0.9^10 = 0.65
d. E(X) = np = 1
e. Var(X) = npq = 0.9

5. [12 marks] Participants of a study with sinusitis received either an antibiotic or a placebo and were asked at the end of a 10-day period if their symptoms had improved. The responses are summarized in the table below:

[Table of responses not reproduced; the counts used in the solution are 66 of 85 improved in the treatment group and 65 of 81 in the control group.]

a. Comment on whether or not we can make a causal statement.
b. Set up hypotheses (give Ho and Ha) to test whether the proportion of patients who reported significant improvement in symptoms is greater in the treatment group than in the control group.
c. Assume that the test statistic follows the Normal distribution and obtain i) the SE, ii) the test statistic, iii) the p-value, and iv) complete the hypothesis test at the 0.05 level of significance.

Answers:
a. Yes, it is an experiment.
b. Ho: p_treat - p_control = 0
   Ha: p_treat - p_control > 0
c. p_treat_hat = 66/85 = 0.78
   p_control_hat = 65/81 = 0.80
   p_treat_hat - p_control_hat = -0.026
   p_hat_pooled = (0.78*85 + 0.80*81)/(85 + 81) = 0.79
   i) SE = 0.063
   ii) Z = -0.41
   iii) p-value (one-sided) = 0.66
   iv) 0.66 is not less than alpha = 0.05. Fail to reject Ho at the 0.05 L.O.S.
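The arithmetic in question 5 can be reproduced with the Python standard library. This is a sketch only: it assumes the counts from the solution (66 of 85 and 65 of 81) and the pooled standard error, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Counts taken from the solution to question 5.
x_treat, n_treat = 66, 85   # improved / total, antibiotic group
x_ctrl, n_ctrl = 65, 81     # improved / total, placebo group

p_treat = x_treat / n_treat
p_ctrl = x_ctrl / n_ctrl

# Pooled proportion, used because Ho states the two proportions are equal.
p_pool = (x_treat + x_ctrl) / (n_treat + n_ctrl)

# Standard error of the difference under Ho.
se = sqrt(p_pool * (1 - p_pool) * (1 / n_treat + 1 / n_ctrl))

# Test statistic: (point estimate - null value) / SE, with null value 0.
z = (p_treat - p_ctrl) / se

# One-sided p-value for Ha: p_treat - p_control > 0 (upper tail).
p_value = 1 - NormalDist().cdf(z)

print(round(se, 3), round(z, 2), round(p_value, 2))
```

Because the observed difference is negative while Ha is one-sided in the positive direction, the p-value is larger than 0.5 and Ho is not rejected.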
6. [12 marks] The following sample data pertain to the shipments received by a large firm from three different vendors. Test at the 0.01 level of significance whether the quality level of the items received and the vendor are independent.

            Number      Number imperfect    Number
            rejected    but acceptable      perfect    Total
Vendor A       12             23               89       124
Vendor B        8             12               62        82
Vendor C       21             30              119       170
Total          41             65              270       376

a. State the hypotheses.
b. Check the conditions.
c. Obtain the test statistic.
d. Obtain the p-value and state the conclusion.

Answers:
a. Ho: Vendor and quality are independent
   Ha: Vendor and quality are not independent
b. Conditions:
   1. It is hard to say whether we have less than 10% of the population; we will assume that we do.
   2. Expected values are all greater than 5.
   3. df = (3 - 1)(3 - 1) = 4, which is greater than 2.
c. Chi-square value: 1.3
d. p-value > 0.3, which is greater than 0.01; therefore fail to reject Ho at the 0.01 L.O.S.

7. [8 marks] Suppose the probability of having the disease is 0.001. If a person has the disease, the probability of a positive test result is 0.90. If a person does not have the disease, the probability of a negative test result is 0.95. For a person selected at random from the population, what is the probability they are infected given they have tested positive?

Note: Use the following notation:
$dis$: a person selected at random has the disease
$\overline{dis}$: a person selected at random does not have the disease
$pos$: a person selected at random tests positive for the disease
$\overline{pos}$: a person selected at random tests negative for the disease

Answer:
P(dis | pos) = (0.001)(0.90) / [(0.001)(0.90) + (0.999)(0.05)] = 0.018
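The answer to question 7 follows from Bayes' rule and the law of total probability. A short Python sketch of that calculation, with illustrative variable names, is:

```python
# Probabilities given in question 7.
p_dis = 0.001                 # P(dis): prevalence of the disease
p_pos_given_dis = 0.90        # P(pos | dis)
p_neg_given_not_dis = 0.95    # P(negative | no disease)

# Complement: probability of a false positive.
p_pos_given_not_dis = 1 - p_neg_given_not_dis

# Law of total probability: overall chance of a positive test.
p_pos = p_dis * p_pos_given_dis + (1 - p_dis) * p_pos_given_not_dis

# Bayes' rule: P(dis | pos) = P(pos | dis) P(dis) / P(pos).
p_dis_given_pos = p_dis * p_pos_given_dis / p_pos

print(round(p_dis_given_pos, 3))  # 0.018
```

The result is small because the disease is rare: false positives from the large healthy group far outnumber true positives.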
8. [10 marks] Consider the following data for the time to commute to work for 10 employees:

Employee    Time to commute (min)
 1          11.3
 2          14.7
 3          16.4
 4          16.5
 5          17
 6          19.9
 7          22.3
 8          23.3
 9          26.1
10          26.2

a. Obtain the class width (use 4 classes or bins).
b. Add a column for frequency.
c. Add a column for relative frequency.
d. Sketch the relative frequency histogram.

Answers:
a. Class width = (26.2 - 11.3)/4 = 3.725
b., c.
bin    upper boundary    freq    rel_freq
1      15.3              2       0.2
2      19.025            3       0.3
3      22.75             2       0.2
4      26.475            3       0.3

9. [8 marks] You are interested in learning about the opinions of Dawson students about a proposed climate change policy.

a. Explain how you would use stratified sampling to obtain sample data.
b. Explain how you would use cluster sampling to obtain sample data.
c. Which of the two methods, stratified or cluster sampling, will take less time?
d. Which of the two methods, stratified or cluster sampling, is likely to give a more representative sample?

Answers:
a. Form strata. One possibility is to use the programs students are enrolled in. Perform simple random sampling within each of these strata.
b. Form clusters (groupings that are similar to one another), perform simple random sampling to select clusters, and then perform simple random sampling within the selected clusters.
c. Cluster sampling would take less time, as we would sample fewer students.
d. Stratified sampling, because we sample more students and every group is represented; leaving out groups, as cluster sampling does, is risky.

10. [8 marks] A student receives e-mails according to a Poisson distribution with an average of 53.5 e-mails every week.

a. Calculate the probability that the student receives exactly 115 e-mails in a 15-day period.
b. Calculate the probability that the time in between two e-mails is greater than 2 hours.

Answers:
a. The mean for 15 days is 53.5 × 15/7 = 114.64, so P(X = 115) = e^(-114.64) (114.64)^115 / 115! = 0.0372
b. The rate is 53.5/168 e-mails per hour, so P(T > 2) = e^(-2 × 53.5/168) = 0.529

11. [6 marks] Identify the outliers in the scatterplots shown below, and determine if they are influential. Explain your reasoning.

[Scatterplots a, b and c not reproduced]

Answers:
a. Certainly influential, as the slope has been strongly affected by the outlier. This is because it is a high-leverage point (large horizontal distance from the center of the data set).
b. Not at all influential, because it falls directly on the line of best fit for the cluster furthest to the left.
c. Not influential, because it has small leverage (small horizontal distance from the center of the data set).
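The two answers to question 10 can be verified numerically. The sketch below uses only the Python standard library and assumes that e-mails arrive at a constant rate, so the weekly mean scales linearly with time; the variable names are illustrative.

```python
from math import exp, lgamma, log

rate_per_week = 53.5

# Part a: Poisson mean for a 15-day period.
lam = rate_per_week * 15 / 7
k = 115

# Evaluate the Poisson pmf on the log scale to keep the
# intermediate values (lam**k and k!) manageable.
log_pmf = k * log(lam) - lam - lgamma(k + 1)
p_exactly_115 = exp(log_pmf)

# Part b: the waiting time between Poisson events is exponential,
# so P(T > t) = exp(-rate * t). Convert the rate to e-mails per hour.
rate_per_hour = rate_per_week / (7 * 24)
p_gap_over_2h = exp(-rate_per_hour * 2)

print(round(p_exactly_115, 4), round(p_gap_over_2h, 3))
```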
12. [12 marks] Researchers studying anthropometry collected body girth measurements and skeletal diameter measurements, as well as age, weight, height and gender for 507 physically active individuals. The scatterplot below shows the relationship between height and shoulder girth (over deltoid muscles), both measured in centimeters. The mean shoulder girth is 108.20 cm with a standard deviation of 10.37 cm. The mean height is 171.14 cm with a standard deviation of 9.41 cm. The coefficient of linear correlation between height and shoulder girth is 0.67.

[Scatterplot: height (cm) against shoulder girth (cm), not reproduced]

a. Write the equation of the regression line for predicting height.
b. Interpret the slope and the intercept in this context.
c. Calculate R^2 of the regression line for predicting height from shoulder girth, and interpret it in the context of the application.
d. A randomly selected student from your class has a shoulder girth of 100 cm. Predict the height of this student using the model.
e. The student from part (d) is 160 cm tall. Calculate the residual, and explain what this residual means.
f. A one-year-old has a shoulder girth of 56 cm. Would it be appropriate to use this linear model to predict the height of this child?

Answers:
a. y_hat = 105.4 + 0.61x
b. Intercept: when shoulder girth is 0 cm, the predicted height is approximately 105.4 cm (which does not make physical sense).
   Slope: for each increase of 1 cm in shoulder girth, the predicted height increases by 0.61 cm on average.
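c. R^2 = 0.67^2 = 0.45. About 45% of the variability in height is explained by the regression line on shoulder girth.
d. Y_hat(100) = 166.15
e. e = 160 - 166.15 = -6.15. This is the vertical distance between the observation (160 cm) and the predicted value (166.15 cm); the model overestimates this student's height by 6.15 cm.
f. Y_hat(56) = 139.41, which is greater than the typical height of a one-year-old. Applying the linear model to x-values beyond the x-values in the data set is not recommended.

Formulas:

\[ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \]

\[ P(A) = \frac{n(A)}{n(S)} \qquad P(A \cup B) = P(A) + P(B) - P(A \cap B) \]

\[ P(A \mid B) = \frac{n(A \cap B)}{n(B)} = \frac{P(A \cap B)}{P(B)} \qquad P(A \cap B) = P(A)\,P(B \mid A) \qquad P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)} \]

\[ \mu = E(x) = \sum x\,P(x) \qquad \sigma^2 = E\left[(x - \mu)^2\right] = \sum (x - \mu)^2 P(x) \]

\[ P(X = k) = C(n, k)\, p^k q^{n-k} \qquad \mu = np \qquad \sigma^2 = npq \]

\[ P(x) = \frac{C(k, x)\, C(N - k,\, n - x)}{C(N, n)} \qquad \mu = \frac{nk}{N} \qquad \sigma^2 = \frac{n\,k\,(N - k)(N - n)}{N^2 (N - 1)} \]

\[ P(X = k) = \frac{\alpha^k}{k!} e^{-\alpha} \]

\[ P(a < x < b) = P(a \le x \le b) = \int_a^b f(x)\,dx \qquad \mu = E(X) = \int x f(x)\,dx \qquad \sigma^2 = E\left[(x - \mu)^2\right] = E(X^2) - \mu^2 \]

\[ f(x) = \begin{cases} \lambda e^{-\lambda x} & x \ge 0 \\ 0 & x < 0 \end{cases} \]

\[ SE_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}} \qquad \text{C.I.: point estimate} \pm Z^{*} \times SE \qquad \text{Test statistic: } Z = \frac{\text{point estimate} - \text{null value}}{SE} \]

\[ SE_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} \qquad SE_{\hat{p}} = \sqrt{\frac{p(1 - p)}{n}} \]

\[ SE_{\hat{p}_1 - \hat{p}_2} = \sqrt{SE_{\hat{p}_1}^2 + SE_{\hat{p}_2}^2} = \sqrt{\frac{p_1 (1 - p_1)}{n_1} + \frac{p_2 (1 - p_2)}{n_2}} \qquad \hat{p}_{pooled} = \frac{\hat{p}_1 n_1 + \hat{p}_2 n_2}{n_1 + n_2} \]

\[ \chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \quad df = k - 1 \qquad \chi^2 = \sum_{i}\sum_{j} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}, \quad df = (R - 1)(C - 1) \]

\[ \hat{y} = b_0 + b_1 x \qquad b_1 = \frac{s_y}{s_x} R \qquad b_0 = \bar{y} - b_1 \bar{x} \]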
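As an illustration of the regression formulas at the end of the sheet, the sketch below applies them to the summary statistics of question 12. The function name `predict` is hypothetical, and unrounded coefficients are carried through the calculation, so results can differ in the last digit from values obtained with rounded coefficients.

```python
# Summary statistics from question 12.
mean_x, sd_x = 108.20, 10.37   # shoulder girth (cm)
mean_y, sd_y = 171.14, 9.41    # height (cm)
r = 0.67                       # coefficient of linear correlation

# Slope and intercept from the formula sheet:
# b1 = (s_y / s_x) * R and b0 = y_bar - b1 * x_bar.
b1 = r * sd_y / sd_x
b0 = mean_y - b1 * mean_x

def predict(girth_cm):
    """Predicted height (cm) for a given shoulder girth (cm)."""
    return b0 + b1 * girth_cm

# Share of the variability in height explained by the line.
r_squared = r ** 2

# Residual = observed - predicted, for the 160 cm student with girth 100 cm.
residual = 160 - predict(100)

print(round(b0, 1), round(b1, 2), round(r_squared, 2))
print(round(predict(100), 2), round(residual, 2))
```

A negative residual means the observed height lies below the regression line, that is, the model overestimates the height.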