STAT 360-Linear Models
Instructor: Yogendra P. Chaubey
Sample Test Questions, Fall 2004

Note: The following questions are from previous tests and exams. The final exam will last three hours and will contain 8 questions similar in nature to those given here; only SIX questions have to be answered. The required tables will be provided.

[3+2+5] Q 1. In an experiment to study the effect of ozone pollution on soybean yield, levels of ozone exposure were measured over the growing season and the following data were collected. The dose of ozone is the average concentration (parts per million, ppm) during the growing season, and the yield is reported in grams per plant.

    X: Ozone (ppm)    Y: Yield (gms/plt)
    0.02              242
    0.07              237
    0.11              231
    0.15              201

    $\sum X_i = 0.35$, $\sum Y_i = 911$, $\sum X_i^2 = 0.0399$, $\sum X_i Y_i = 76.99$, $\sum Y_i^2 = 208{,}495$

Using a linear regression function $E\{Y\} = \beta_0 + \beta_1 X$, the least squares estimates of $\beta_0$ and $\beta_1$ are, respectively, $b_0 = 253.43$ and $b_1 = -293.531$.

Compute the residuals for the above data and prepare residual plots to check for the following departures from the model.
(a) The model is non-linear.
(b) The error variance is not constant.
(c) The errors are not normally distributed.
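A minimal sketch of the residual computation asked for in Q 1, written in Python. It assumes the data read X = 0.02, 0.07, 0.11, 0.15 and Y = 242, 237, 231, 201; this reading is an assumption, and the code checks it against the sums stated in the question before recomputing $b_0$ and $b_1$. All variable names are illustrative.

```python
# Sketch for Q 1: residuals e_i = Y_i - (b0 + b1 * X_i).
# ASSUMPTION: the data values below are a reading of the question; they are
# verified here against the summary sums that the question states.
ozone = [0.02, 0.07, 0.11, 0.15]          # X, ppm
yield_g = [242.0, 237.0, 231.0, 201.0]    # Y, grams per plant
n = len(ozone)

sum_x = sum(ozone)
sum_y = sum(yield_g)
sum_xx = sum(x * x for x in ozone)
sum_xy = sum(x * y for x, y in zip(ozone, yield_g))
sum_yy = sum(y * y for y in yield_g)

# Check the assumed data against the sums given in the question.
assert abs(sum_x - 0.35) < 1e-9 and abs(sum_y - 911) < 1e-9
assert abs(sum_xx - 0.0399) < 1e-9 and abs(sum_xy - 76.99) < 1e-9
assert abs(sum_yy - 208495) < 1e-6

# Least squares estimates from the corrected sums of squares and products.
b1 = (sum_xy - sum_x * sum_y / n) / (sum_xx - sum_x ** 2 / n)
b0 = sum_y / n - b1 * sum_x / n

fitted = [b0 + b1 * x for x in ozone]
residuals = [y - f for y, f in zip(yield_g, fitted)]

for x, f, e in zip(ozone, fitted, residuals):
    print(f"X={x:.2f}  fitted={f:8.3f}  residual={e:7.3f}")
```

Plotting these residuals against X (or against the fitted values) addresses parts (a) and (b); a normal probability plot of the residuals addresses part (c). With only four observations, any such plot is suggestive at best.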
Stat 360 / Sample Questions - Final Examination, December 2004

[4+3+3] Q 2. (a) Describe the normal regression model through the origin. Show that the least squares estimator of the regression parameter is given by

    $b = \dfrac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} X_i^2}.$

(b) Show that $b \sim N(\beta, \sigma^2\{b\})$, where

    $\sigma^2\{b\} = \dfrac{\sigma^2}{\sum_{i=1}^{n} X_i^2}.$

(c) Prove the Bonferroni inequality

    $P(A_1^C \cap A_2^C) \ge 1 - P(A_1) - P(A_2)$

and hence justify that the region in the $(\beta_0, \beta_1)$ plane given by $|\beta_i - b_i| \le B\, s\{b_i\}$, $i = 0, 1$, where $B = t(1 - \alpha/4;\ n - 2)$, provides a $100(1 - \alpha)\%$ joint confidence region for $(\beta_0, \beta_1)$ in the context of the simple linear model.

[4+4+2] Q 3. A substance used in biological and medical research is shipped by airfreight to users in cartons of 1,000 ampules. The data below, involving 10 shipments, were collected on the number of times the carton was transferred from one aircraft to another over the shipment route (X) and the number of ampules found to be broken upon arrival (Y).

    i:     1   2   3   4   5   6   7   8   9  10
    X_i:   1   0   2   0   3   1   0   1   2   0
    Y_i:  16   9  17  12  22  13   8  15  19  11

Assume that a first order regression model is appropriate to answer the following questions.
(a) Set up the matrices X and Y and compute the estimator of the regression parameter using matrix methods:

    $b = (X'X)^{-1} X'Y = \begin{bmatrix} 10.2 \\ 4.0 \end{bmatrix}$

(b) Use matrix methods to reproduce the following ANOVA table for the above data:

    Source            DF      SS       MS       F
    Regression         1    160.00   160.00   72.73
    Residual Error     8     17.60     2.20
    Total              9    177.60

(c) Determine $s^2\{b_0\}$, $s^2\{b_1\}$ and $s\{b_0, b_1\}$ using the matrix formula for $s^2\{b\}$. Use these to compute $s^2\{\hat{Y}_h\}$ for $X_h = 0$ and show that it coincides with

    $MSE\,\bigl(X_h'(X'X)^{-1}X_h\bigr)$, where $X_h = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$.

The matrix $X'X$ should equal

    $X'X = \begin{bmatrix} 10 & 10 \\ 10 & 20 \end{bmatrix}.$

[4+4+2] Q 4. The following are sample data provided by a moving company on the weights of six shipments, the distances they were moved, and the damage that was incurred.

         Weight (1,000 lbs)   Distance (1,000 miles)   Damage ($)
    i    X_i1                 X_i2                     Y_i
    1    4.0                  1.5                      160
    2    3.0                  2.2                      112
    3    1.6                  1.0                       69
    4    1.2                  2.0                       90
    5    3.0                  2.2                      122
    6    4.0                  1.5                      186

Assume that the regression model $Y_i = \beta_0 + \beta_1 X_{i1} + \beta_2 X_{i2} + \epsilon_i$ fits the data. Using the method of matrices, obtain the following:
(i) The vector of estimated regression coefficients, i.e. $b$.
(ii) SSTO, SSR and SSE.
(iii) The estimated variance-covariance matrix of $b$.

You can use the values of $(X'X)^{-1}$ and $X'Y$ given below:

    $(X'X)^{-1} = \begin{bmatrix} 3.80427 & -0.38533 & -1.47616 \\ -0.38533 & 0.14379 & -0.00997 \\ -1.47616 & -0.00997 & 0.86774 \end{bmatrix}, \quad X'Y = \begin{bmatrix} 739.0 \\ 2304.4 \\ 1282.8 \end{bmatrix}$

[5+5] Q 5. (a) Let $Y$ be a random vector with $n$ components and $A$ be a matrix of constants of order $m \times n$. Using matrix methods, show that the mean and variance-covariance matrix of $W = AY$ are given by

    $E\{W\} = A\,E\{Y\}$
    $\sigma^2\{W\} = A\,\sigma^2\{Y\}\,A'$

(b) Use the above results to prove the following:
(i) $E\{e\} = 0$.
(ii) $\sigma^2\{e\} = \sigma^2 (I - H)$, where $H = X(X'X)^{-1}X'$.

[3+4+3] Q 6. In a marketing research study of the relation between brand liking (Y) and the moisture content ($X_1$) and sweetness ($X_2$) of the product, the following results were obtained:

    i:      1   2   3   4   5   6   7   8   9  10  11   12
    X_i1:   6   6   6   6   8   8   8   8  10  10  10   10
    X_i2:   2   4   2   4   2   4   2   4   2   4   2    4
    Y_i:   72  80  71  83  83  89  86  93  88  95  94  100

(a) The MINITAB output after running a regression of Y on $X_1$ and $X_2$ is given below.
    The regression equation is
    Y = 39.2 + 4.44 X1 + 3.83 X2

    Predictor      Coef     StDev      T       P
    Constant     39.167     4.735    8.27   0.000
    X1           4.4375    0.4972    8.92   0.000
    X2           3.8333    0.8120    4.72   0.001

    S = 2.813    R-Sq = 91.9%    R-Sq(adj) = 90.1%

    Source            DF      SS       MS       F
    Regression         2    806.46   403.23   50.96
    Residual Error     9     71.21     7.91
    Total             11    877.67

    Source   DF   Seq SS
    X1        1   630.12
    X2        1   176.33

Answer the following questions based on the above output.
(i) Use the extra SS principle to test whether $X_2$ should be retained in the model at $\alpha = 0.10$.
(ii) Perform a lack of fit test for the appropriateness of the two-predictor first order linear model (use $\alpha = 0.05$).

(b) After running the two-variable regression above, the researcher thought that it might be better to incorporate the interaction term in the model, and thus fitted a regression of Y on $X_1$, $X_2$ and $X_1 X_2$, producing the following ANOVA table:

    Source            DF      SS       MS       F
    Regression         3    822.13   274.04   39.47
    Residual Error     8     55.54     6.94
    Total             11    877.67
Use the extra SS principle to determine whether the term $X_1 X_2$ should also be included, given that $X_1$ and $X_2$ are already in the model. Use a level of significance $\alpha = 0.05$.

[2+5+3] Q 7. The following MINITAB output is obtained from the simple linear regression of GPA data, relating the GPA at graduation (Y) to the admission test score (X):

    Regression Analysis

    The regression equation is
    Y(GPA) = - 1.70 + 0.840 X(Test)

    Predictor       Coef     StDev       T       P
    Constant     -1.6996    0.7268   -2.34   0.031
    X(Test)       0.8399    0.1440    5.83   0.000

    S = 0.4350    R-Sq = 65.4%    R-Sq(adj) = 63.5%

    Source            DF      SS       MS       F       P
    Regression         1    6.4337   6.4337   34.00   0.000
    Residual Error    18    3.4063   0.1892
    Total             19    9.8400

Use this output to answer the following problems.
(a) Give a point prediction of the GPA for a test score of 5.6.
(b) Give a 95% confidence interval for the mean GPA of students with an entrance test score of 5.6.
(c) A new student with an entrance test score of 5.6 is admitted. What range of GPA could this student obtain, with confidence coefficient 0.95?
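A hedged sketch of how the three parts of Q 7 could be worked from the printed output alone, in Python. It is one possible route, not necessarily the intended solution. The inputs are values as read from the MINITAB output (the standard error 0.7268 and MSE 0.1892 are readings consistent with T = -2.34 and S = 0.4350), and the critical value t(0.975; 18) = 2.101 is taken from a standard t table. The output does not report $\bar{X}$, so the sketch recovers it from $s^2\{b_0\} = MSE/n + \bar{X}^2 s^2\{b_1\}$, which is sensitive to rounding in the printed values.

```python
import math

# ASSUMPTION: values as read from the MINITAB output in Q 7.
b0, b1 = -1.6996, 0.8399
s_b0, s_b1 = 0.7268, 0.1440
mse, n = 0.1892, 20
x_h = 5.6
t_crit = 2.101  # t(0.975; 18), assumed from a standard t table

# (a) Point prediction at X_h.
y_hat = b0 + b1 * x_h

# Recover X-bar from s^2{b0} = MSE/n + X-bar^2 * s^2{b1}.
# The positive root is taken because test scores are positive.
x_bar = math.sqrt((s_b0 ** 2 - mse / n) / s_b1 ** 2)

# (b) Variance of the estimated mean response:
#     s^2{Y_hat_h} = MSE/n + (X_h - X-bar)^2 * s^2{b1}
s2_mean = mse / n + (x_h - x_bar) ** 2 * s_b1 ** 2
half_ci = t_crit * math.sqrt(s2_mean)
ci = (y_hat - half_ci, y_hat + half_ci)

# (c) Variance for predicting a single new observation:
#     s^2{pred} = MSE + s^2{Y_hat_h}
s2_pred = mse + s2_mean
half_pi = t_crit * math.sqrt(s2_pred)
pi = (y_hat - half_pi, y_hat + half_pi)

print(f"point prediction: {y_hat:.3f}")
print(f"95% CI for mean response: ({ci[0]:.3f}, {ci[1]:.3f})")
print(f"95% prediction interval:  ({pi[0]:.3f}, {pi[1]:.3f})")
```

The prediction interval in (c) is wider than the confidence interval in (b) because it adds the variance of a single new observation (MSE) to the variance of the estimated mean.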
[5+5] Q 8. Use the data and output of Q 6(a) to answer the following questions, assuming that the two-predictor first order linear model is appropriate. Use

    $(X'X)^{-1} = \begin{bmatrix} 2.83333 & -0.25000 & -0.25000 \\ -0.25000 & 0.03125 & 0.00000 \\ -0.25000 & 0.00000 & 0.08333 \end{bmatrix}$

(a) Find joint confidence intervals for $E\{Y_h\}$ at $X_{h1} = 8$, $X_{h2} = 2$ and at $X_{h1} = 8$, $X_{h2} = 4$ with joint confidence coefficient 0.90, using the Bonferroni and Working-Hotelling approaches. Which of these would be preferred?
(b) Suppose two new observations on Y, obtained at the above two prescribed levels of $X_1$ and $X_2$, are 80 and 90. Would you have 90% confidence in claiming that these observations were generated from the first order model estimated in Q 6(a)?
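A minimal sketch for Q 8(a) in Python, comparing the Bonferroni and Working-Hotelling multipliers. It assumes the levels are $(X_{h1}, X_{h2}) = (8, 2)$ and $(8, 4)$, that the off-diagonal entries of $(X'X)^{-1}$ are $-0.25$, and that the coefficients and MSE are those in the Q 6(a) output; these are readings of the question. The critical values t(0.975; 9) = 2.262 and F(0.90; 3, 9) = 2.81 are assumed from standard tables. Here n = 12, p = 3 parameters and g = 2 intervals, so the error degrees of freedom are 9.

```python
import math

# ASSUMPTION: quantities as read from Q 6(a) and Q 8.
XtX_inv = [[2.83333, -0.25000, -0.25000],
           [-0.25000, 0.03125, 0.00000],
           [-0.25000, 0.00000, 0.08333]]
b = [39.167, 4.4375, 3.8333]
mse = 7.91
levels = [[1.0, 8.0, 2.0], [1.0, 8.0, 4.0]]  # rows are (1, X_h1, X_h2)

# ASSUMPTION: critical values taken from standard tables.
t_crit = 2.262  # t(1 - 0.10/(2*2); 9) = t(0.975; 9)
f_crit = 2.81   # F(0.90; 3, 9)

B = t_crit                   # Bonferroni multiplier for g = 2 intervals
W = math.sqrt(3 * f_crit)    # Working-Hotelling multiplier, W^2 = p * F


def quad_form(x, m):
    """Return x' m x for a vector x and a square matrix m (lists)."""
    k = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(k) for j in range(k))


results = []
for x_h in levels:
    y_hat = sum(bi * xi for bi, xi in zip(b, x_h))
    se = math.sqrt(mse * quad_form(x_h, XtX_inv))  # s{Y_hat_h}
    results.append((y_hat, se))
    print(f"X_h={x_h[1:]}  Y_hat={y_hat:.3f}  s={se:.3f}  "
          f"Bonferroni=({y_hat - B * se:.2f}, {y_hat + B * se:.2f})  "
          f"W-H=({y_hat - W * se:.2f}, {y_hat + W * se:.2f})")

print(f"B = {B:.3f}, W = {W:.3f}")
```

Under these assumed table values, whichever multiplier is smaller gives the tighter intervals at the same joint confidence level, and that is the approach one would normally prefer.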