
Concordia University
Department of Mathematics and Statistics

Course: Statistics    Number: 360/2    Section: 01
Examination: Final    Date: December 2002    Time: 3 hours    Pages: 6
Instructors: Y.P. Chaubey    Course Examiner: Y.P. Chaubey    Marks: 60

Special Instructions: CLOSED BOOK EXAM
1. Calculators are permitted.
2. Full credit will be given only for systematic and detailed work.
3. The tables needed are given on the last page.
4. Answer THREE questions from PART I and THREE questions from PART II.

PART I

[2+4+4] Q 1.
(a) Describe all the assumptions for a normal error regression model with one predictor variable, $Y = \beta_0 + \beta_1 X + \varepsilon$.
(b) Describe the least squares principle for estimating the parameters $(\beta_0, \beta_1)$, given the data points $(X_i, Y_i)$, $i = 1, \ldots, n$. Show that this principle yields the following estimators of $(\beta_0, \beta_1)$, respectively:

$$b_0 = \bar{Y} - b_1 \bar{X}, \qquad b_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$$

(c) The following data (in coded form) were collected in studying the effect of temperature (X) on the yield (Y) of a chemical process:

X   -5   -4   -3   -2   -1    0    1    2    3    4    5
Y    1    5    4    7   10    8    9   13   14   13   18

Assuming a simple linear model, find the prediction equation. You may use the following computations:

$$\sum X_i = 0, \quad \sum Y_i = 102, \quad \sum X_i Y_i = 158, \quad \sum X_i^2 = 110, \quad \sum Y_i^2 = 1194.$$
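As a hedged illustration of the least squares formulas in Q 1(b), the following Python sketch fits the simple linear model to the Q 1(c) data. The variable names (`s_xy`, `s_xx`, and so on) are illustrative choices rather than notation from the exam, and the Y values are the eleven values that reproduce the sums quoted in the question.

```python
# Coded temperature (X) and yield (Y) from Q 1(c).
X = [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5]
Y = [1, 5, 4, 7, 10, 8, 9, 13, 14, 13, 18]

n = len(X)
sum_x = sum(X)
sum_y = sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_xx = sum(x * x for x in X)

x_bar = sum_x / n
y_bar = sum_y / n

# Corrected sums of products and squares:
# s_xy = sum (X_i - X_bar)(Y_i - Y_bar), s_xx = sum (X_i - X_bar)^2.
s_xy = sum_xy - n * x_bar * y_bar
s_xx = sum_xx - n * x_bar ** 2

# Least squares estimators from Q 1(b).
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

print(f"Prediction equation: Y_hat = {b0:.4f} + {b1:.4f} X")
```

Because $\sum X_i = 0$ here, the intercept reduces to $\bar{Y} = 102/11$ and the slope to $\sum X_i Y_i / \sum X_i^2 = 158/110$, which gives a quick hand check on the output.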

[4+6] Q 2.
(a) Prove that

$$b_1 \sim N\left(\beta_1, \frac{\sigma^2}{S_{xx}}\right), \quad \text{where } S_{xx} = \sum_{i=1}^{n} (X_i - \bar{X})^2.$$

(b) Consider the normal error regression model given by $Y_i = \beta X_i + \varepsilon_i$, $i = 1, \ldots, n$, where the notation has the usual meaning. Show that the maximum likelihood estimators of $\beta$ and $\sigma^2$ are, respectively, given by

$$b = \frac{\sum_i X_i Y_i}{\sum_i X_i^2}, \qquad \hat{\sigma}^2 = \frac{1}{n} \sum_i (Y_i - b X_i)^2.$$

[5+5] Q 3.
(a) Define the sums of squares SSTO, SSR and SSE. Prove that

$$SSR = b_1^2 S_{xx}$$

and hence derive the fact that

$$E(SSR) = \sigma^2 + \beta_1^2 S_{xx}.$$

(b) Prepare the ANOVA table for the data in Q. 1(c) and use it to test whether the variable temperature should be retained in the model. Use a 1% level of significance.

[6+4] Q 4.
(a) Prove the Bonferroni inequality

$$P(A_1^c \cap A_2^c) \ge 1 - P(A_1) - P(A_2)$$

and hence justify the following joint confidence intervals for $(\beta_0, \beta_1)$:

$$\beta_i: \; b_i \pm B\, s\{b_i\}, \quad i = 0, 1, \quad \text{where } B = t\left(1 - \frac{\alpha}{4};\, n - 2\right).$$

(b) A person's muscle mass (Y) is expected to decrease with age (X). To explain this relationship in women, a nutritionist randomly selected 10 women aged 40-79 years. The following results were obtained.

Y   82.0   91.0   100.0   68.0   87.0   73.0   78.0   80.0   65.0   84.0
X   71.0   64.0    43.0   67.0   56.0   73.0   68.0   56.0   76.0   65.0

The following output was obtained using MINITAB software:

Regression Analysis

The regression equation is
Y = 136 - 0.856 X

Predictor    Coef       StDev     T        P
Constant     135.53     14.86      9.12    0.000
X            -0.8565    0.2302    -3.72    0.006

S = 6.784    R-Sq = 63.4%    R-Sq(adj) = 58.8%

Analysis of Variance

Source            DF    SS         MS        F        P
Regression         1    637.40     637.40    13.85    0.006
Residual Error     8    368.20      46.03
Total              9    1005.60

Would you agree with the original hypothesis? If so, give a 90% confidence interval for the average decrease in muscle mass per year.
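One possible way to carry out the interval calculation in Q 4(b) is sketched below in Python. It assumes the usual interval $b_1 \pm t(0.95; 8)\, s\{b_1\}$, takes the slope and its standard deviation from the MINITAB output, and reads $t(0.95; 8) = 1.8595$ from the table at the end of the exam. The variable names are illustrative.

```python
# Values read from the MINITAB output in Q 4(b).
b1 = -0.8565     # estimated slope (change in muscle mass per year of age)
se_b1 = 0.2302   # estimated standard deviation of b1
df_error = 8     # residual degrees of freedom, n - 2 = 10 - 2

# t(0.95; 8) from the exam's table; a 90% two-sided interval
# leaves 5% in each tail.
t_crit = 1.8595

half_width = t_crit * se_b1
lower = b1 - half_width
upper = b1 + half_width

print(f"t* = b1 / s(b1) = {b1 / se_b1:.2f}")
print(f"90% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

Both limits are negative, so under these assumptions the interval is consistent with muscle mass decreasing with age; the average decrease per year is the negative of the slope.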

PART II

[5+5] Q 5.
(a) Let $\mathbf{Y}$ be a random vector with $n$ components and $\mathbf{A}$ be a matrix of constants of order $m \times n$. Using matrix methods, show that the mean $E\{\mathbf{W}\}$ and the variance-covariance matrix $\sigma^2\{\mathbf{W}\}$ of $\mathbf{W} = \mathbf{A}\mathbf{Y}$ are given by

$$E\{\mathbf{W}\} = \mathbf{A}\, E\{\mathbf{Y}\}, \qquad \sigma^2\{\mathbf{W}\} = \mathbf{A}\, \sigma^2\{\mathbf{Y}\}\, \mathbf{A}'.$$

(b) Use the above results to prove the following:
(i) $E\{\mathbf{b}\} = \boldsymbol{\beta}$.
(ii) $\sigma^2\{\mathbf{b}\} = \sigma^2 (\mathbf{X}'\mathbf{X})^{-1}$.

[5+5] Q 6. The following are the sample data provided by a moving company on the weights of six shipments, the distances they were moved and the damage that occurred.

     Weight (1,000 lbs)   Distance (1,000 miles)   Damage (dollars)
i    X_i1                 X_i2                     Y_i
1    4.0                  1.5                      160
2    3.0                  2.2                      112
3    1.6                  1.0                       69
4    1.2                  2.0                       90
5    3.0                  2.2                      122
6    4.0                  1.5                      186

Assume that the regression model $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \varepsilon_i$ fits the data. Using matrix methods, obtain the following:
(i) The vector of estimated regression coefficients, i.e. $\mathbf{b}$.
(ii) The estimated variance-covariance matrix of $\mathbf{b}$.

You can use the values of $(\mathbf{X}'\mathbf{X})^{-1}$ and $\mathbf{X}'\mathbf{Y}$ given below:

$$(\mathbf{X}'\mathbf{X})^{-1} = \begin{bmatrix} 3.80427 & -0.38533 & -1.47616 \\ -0.38533 & 0.14379 & -0.00997 \\ -1.47616 & -0.00997 & 0.86774 \end{bmatrix}, \qquad \mathbf{X}'\mathbf{Y} = \begin{bmatrix} 739.0 \\ 2304.4 \\ 1282.8 \end{bmatrix}$$
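The matrix calculations asked for in Q 6 can be sketched in plain Python as follows. This is only an illustration: it assumes the off-diagonal entries of $(\mathbf{X}'\mathbf{X})^{-1}$ are negative, which is the sign pattern obtained by inverting $\mathbf{X}'\mathbf{X}$ built from the listed data, and it uses the inverse rounded to five decimals, so the last digits of $\mathbf{b}$ can differ slightly from an exact solution. Names such as `XtX_inv` and `cov_b` are illustrative.

```python
# Shipment data from Q 6: weight (1,000 lbs), distance (1,000 miles), damage.
X1 = [4.0, 3.0, 1.6, 1.2, 3.0, 4.0]
X2 = [1.5, 2.2, 1.0, 2.0, 2.2, 1.5]
Y = [160, 112, 69, 90, 122, 186]

# (X'X)^{-1} rounded to five decimals. Assumption: the off-diagonal
# entries are negative, as obtained by inverting X'X from the data.
XtX_inv = [
    [3.80427, -0.38533, -1.47616],
    [-0.38533, 0.14379, -0.00997],
    [-1.47616, -0.00997, 0.86774],
]
XtY = [739.0, 2304.4, 1282.8]

# (i) b = (X'X)^{-1} X'Y
b = [sum(row[j] * XtY[j] for j in range(3)) for row in XtX_inv]

# (ii) s^2{b} = MSE * (X'X)^{-1}, with MSE = SSE / (n - p).
n, p = len(Y), 3
fitted = [b[0] + b[1] * x1 + b[2] * x2 for x1, x2 in zip(X1, X2)]
sse = sum((y - f) ** 2 for y, f in zip(Y, fitted))
mse = sse / (n - p)
cov_b = [[mse * entry for entry in row] for row in XtX_inv]

print("b =", [round(v, 3) for v in b])
print("MSE =", round(mse, 2))
for row in cov_b:
    print([round(v, 2) for v in row])
```

Computing SSE from the residuals, rather than from $\mathbf{Y}'\mathbf{Y} - \mathbf{b}'\mathbf{X}'\mathbf{Y}$, is a design choice here: it is less sensitive to the rounding in the inverse.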

[5+5] Q 7. Refer to the data in Q. 4(b) to answer the following questions:
(a) Assuming the linear model to be appropriate, obtain 95% simultaneous confidence intervals for $\beta_0$ and $\beta_1$.
(b) Find 90% Bonferroni and Scheffé simultaneous prediction intervals for muscle mass for women aged 40, 50 and 60 years.

[5+5] Q 8. The following data were obtained from a small mail order clearing house, where Y represents the number of parcels dispatched, X1 represents the number of employees and X2 represents the number of men.

Y      X1    X2
 50     1     0
110     2     0
 90     2     0
150     3     0
140     3     0
180     3     0
190     4     1
310     6     0
330     6     0
340     7     1
360     8     3
380    10     6
360    10     6

A statistician suggested using the multiple regression model $E(Y) = \beta_0 + \beta_1 X_1 + \beta_2 X_2$, with the following ANOVA output:

Analysis of Variance

Source            DF    SS        MS       F         P
Regression         2    168031    84016    474.98    0.000
Residual Error    10      1769      177
Total             12    169800

(a) Comment on the goodness of fit of this model.
(b) The manager asked the statistician why, since X2 is already included in X1, it could not be taken out of the equation. Provide an answer to satisfy the manager. You may use the following sequential sums of squares.

Source    DF    Seq SS
X1         1    155258
X2         1     12773

NOTE:
1. $t(A; \nu)$ is defined by $P[t_\nu > t(A; \nu)] = 1 - A$.
2. $F(A; \nu_1, \nu_2)$ is defined by $P[F_{\nu_1, \nu_2} > F(A; \nu_1, \nu_2)] = 1 - A$.

Values of t(A; ν)

                                   A
ν      0.900     0.950     0.975     0.980     0.985     0.990     0.995
4      1.5332    2.1318    2.7764    2.9985    3.2976    3.7469    4.6041
5      1.4759    2.0150    2.5706    2.7565    3.0029    3.3649    4.0321
6      1.4398    1.9432    2.4469    2.6122    2.8289    3.1427    3.7074
7      1.4149    1.8946    2.3646    2.5168    2.7146    2.9980    3.4995
8      1.3968    1.8595    2.3060    2.4490    2.6338    2.8965    3.3554
9      1.3830    1.8331    2.2622    2.3984    2.5738    2.8214    3.2498
10     1.3722    1.8125    2.2281    2.3593    2.5275    2.7638    3.1693
18     1.3304    1.7341    2.1009    2.2137    2.3562    2.5524    2.8784
19     1.3277    1.7291    2.0930    2.2047    2.3456    2.5395    2.8609
20     1.3253    1.7247    2.0860    2.1967    2.3362    2.5280    2.8453

Values of F(A; ν1, ν2)

A        ν1 = 1, ν2 = 6    ν1 = 1, ν2 = 8    ν1 = 1, ν2 = 9    ν1 = 2, ν2 = 8    ν1 = 2, ν2 = 10
0.900     3.7759            3.4579            3.3603            3.1131            2.9245
0.950     5.9874            5.3177            5.1174            4.4590            4.1028
0.975     8.8131            7.5709            7.2093            6.0595            5.4564
0.980     9.8764            8.3895            7.9605            6.6366            5.9336
0.985    11.3723            9.5180            8.9892            7.4298            6.5812
0.990    13.7450           11.2586           10.5614            8.6491            7.5514
0.995    18.6350           14.6882           13.6136           11.0424            9.4270
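For Q 8, the ANOVA output and the sequential sums of squares can be combined as in the Python sketch below. This is one possible line of reasoning, not a prescribed solution: it assumes a 5% significance level (the question does not state one) and, since the F table above has no column for $(\nu_1, \nu_2) = (1, 10)$, it uses the identity $F(0.95; 1, 10) = [t(0.975; 10)]^2$ with $t(0.975; 10) = 2.2281$ from the t table. The variable names are illustrative.

```python
# Values from the ANOVA output and sequential sums of squares in Q 8.
ssr = 168031            # regression sum of squares (X1 and X2)
ssto = 169800           # total sum of squares
mse = 177               # residual mean square, 10 degrees of freedom
ssr_x1 = 155258         # Seq SS for X1
ssr_x2_given_x1 = 12773 # Seq SS for X2 after X1

# (a) Coefficient of multiple determination.
r_sq = ssr / ssto

# (b) Partial F test of H0: beta_2 = 0, given X1 is in the model.
f_star = (ssr_x2_given_x1 / 1) / mse

# Assumed 5% level: F(0.95; 1, 10) = t(0.975; 10)^2, t from the table.
t_crit = 2.2281
f_crit = t_crit ** 2

reject_h0 = f_star > f_crit

print(f"R-sq = {r_sq:.4f}")
print(f"F* = {f_star:.2f}, critical value = {f_crit:.2f}")
print("X2 adds significantly to the model:", reject_h0)
```

The two sequential sums of squares add up to the regression sum of squares, which gives a quick consistency check on the inputs before the test is run.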