Simple Linear Regression

Problems

1. Consider a set of data $(x_i, y_i)$, $i = 1, 2, \ldots, n$, and the following two regression models:

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad (i = 1, 2, \ldots, n), \qquad \text{Model A}$$

$$y_i = \gamma_0 + \gamma_1 x_i + \gamma_2 x_i^2 + \varepsilon_i, \quad (i = 1, 2, \ldots, n), \qquad \text{Model B}$$

Suppose both models are fitted to the same data. Show that

$$SS_{Res,A} \ge SS_{Res,B}.$$

If more higher-order terms are added to Model B above, i.e.,

$$y_i = \gamma_0 + \gamma_1 x_i + \gamma_2 x_i^2 + \gamma_3 x_i^3 + \cdots + \gamma_k x_i^k + \varepsilon_i, \quad (i = 1, 2, \ldots, n),$$

show that the inequality $SS_{Res,A} \ge SS_{Res,B}$ still holds.

2. Consider the zero-intercept model given by

$$y_i = \beta_1 x_i + \varepsilon_i, \quad (i = 1, 2, \ldots, n),$$

where the $\varepsilon_i$'s are independent normal variables with constant variance $\sigma^2$. Show that the $100(1-\alpha)\%$ confidence interval on $E(y \mid x_0)$ is given by

$$b_1 x_0 \pm t_{\alpha/2,\, n-1}\, s \sqrt{\frac{x_0^2}{\sum_{i=1}^{n} x_i^2}},$$

where

$$s = \sqrt{\frac{\sum_{i=1}^{n} (y_i - b_1 x_i)^2}{n-1}} \quad \text{and} \quad b_1 = \frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}.$$

3. Derive and discuss the $(1-\alpha)100\%$ confidence interval on the slope $\beta_1$ for the simple linear model with zero intercept.

4. Consider the fixed zero-intercept regression model

$$y_i = \beta_1 x_i + \varepsilon_i, \quad (i = 1, 2, \ldots, n).$$

The appropriate estimator of $\sigma^2$ is given by

$$s^2 = \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n-1}.$$

Show that $s^2$ is an unbiased estimator of $\sigma^2$.
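As a numerical companion to Problems 2 to 4, the zero-intercept quantities can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `zero_intercept_fit` and `mean_response_ci` are hypothetical, the data are made up, and the critical value $t_{\alpha/2,\,n-1}$ is supplied by the caller from a t table rather than computed.

```python
import math


def zero_intercept_fit(x, y):
    """Return (b1, s2, sxx) for the model y_i = beta1 * x_i + eps_i.

    b1  = sum(x_i * y_i) / sum(x_i^2)          (least squares slope)
    s2  = sum((y_i - b1 * x_i)^2) / (n - 1)    (estimator of sigma^2)
    sxx = sum(x_i^2)                           (uncentered sum of squares)
    """
    n = len(x)
    sxx = sum(xi * xi for xi in x)
    b1 = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    ss_res = sum((yi - b1 * xi) ** 2 for xi, yi in zip(x, y))
    s2 = ss_res / (n - 1)
    return b1, s2, sxx


def mean_response_ci(x, y, x0, t_crit):
    """Confidence interval on E(y | x0): b1*x0 +/- t_crit * s * sqrt(x0^2 / sxx)."""
    b1, s2, sxx = zero_intercept_fit(x, y)
    half_width = t_crit * math.sqrt(s2 * x0 * x0 / sxx)
    return b1 * x0 - half_width, b1 * x0 + half_width


# Made-up illustrative data (not from the text).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

# t_{0.025, 4} from a standard t table (n - 1 = 4 degrees of freedom).
t_crit = 2.776

b1, s2, sxx = zero_intercept_fit(x, y)
lower, upper = mean_response_ci(x, y, 3.0, t_crit)
print("b1 =", b1, " s^2 =", s2)
print("95% CI on E(y | x0 = 3):", (lower, upper))
```

Note that the interval collapses to a single point at $x_0 = 0$, since a line through the origin has no uncertainty there.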
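Linear Regression Analysis: Theory and Computing

5. Consider a situation in which the regression data set is divided into two parts as shown in Table 2.10.

Table 2.10 Data for Two Parallel Regression Lines

$$
\begin{array}{cc}
x & y \\ \hline
x_1 & y_1 \\
\vdots & \vdots \\
x_{n_1} & y_{n_1} \\ \hline
x_{n_1+1} & y_{n_1+1} \\
\vdots & \vdots \\
x_{n_1+n_2} & y_{n_1+n_2}
\end{array}
$$

The regression model is given by

$$
y_i =
\begin{cases}
\beta_0^{(1)} + \beta_1 x_i + \varepsilon_i, & i = 1, 2, \ldots, n_1; \\
\beta_0^{(2)} + \beta_1 x_i + \varepsilon_i, & i = n_1+1, \ldots, n_1+n_2.
\end{cases}
$$

In other words, there are two regression lines with a common slope. Using the centered regression model

$$
y_i =
\begin{cases}
\beta_0^{(1*)} + \beta_1 (x_i - \bar{x}_1) + \varepsilon_i, & i = 1, 2, \ldots, n_1; \\
\beta_0^{(2*)} + \beta_1 (x_i - \bar{x}_2) + \varepsilon_i, & i = n_1+1, \ldots, n_1+n_2,
\end{cases}
$$

where $\bar{x}_1 = \sum_{i=1}^{n_1} x_i / n_1$ and $\bar{x}_2 = \sum_{i=n_1+1}^{n_1+n_2} x_i / n_2$, show that the least squares estimate of $\beta_1$ is given by

$$
b_1 = \frac{\sum_{i=1}^{n_1} (x_i - \bar{x}_1)\, y_i + \sum_{i=n_1+1}^{n_1+n_2} (x_i - \bar{x}_2)\, y_i}{\sum_{i=1}^{n_1} (x_i - \bar{x}_1)^2 + \sum_{i=n_1+1}^{n_1+n_2} (x_i - \bar{x}_2)^2}.
$$

6. Consider two simple linear models

$$Y_{1j} = \alpha_1 + \beta_1 x_{1j} + \varepsilon_{1j}, \quad j = 1, 2, \ldots, n_1$$

and

$$Y_{2j} = \alpha_2 + \beta_2 x_{2j} + \varepsilon_{2j}, \quad j = 1, 2, \ldots, n_2.$$

Assume that $\beta_1 \ne \beta_2$, so that the two linear models above intersect. Let $x_0$ be the point on the x-axis at which the two linear models intersect. Also assume that the $\varepsilon_{ij}$ are independent normal variables with variance $\sigma^2$. Show that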
(a) $$x_0 = -\frac{\alpha_1 - \alpha_2}{\beta_1 - \beta_2}.$$

(b) Find the maximum likelihood estimate (MLE) of $x_0$ using the least squares estimators $\hat{\alpha}_1$, $\hat{\alpha}_2$, $\hat{\beta}_1$, and $\hat{\beta}_2$.

(c) Show that the distribution of $Z$, where

$$Z = (\hat{\alpha}_1 - \hat{\alpha}_2) + x_0 (\hat{\beta}_1 - \hat{\beta}_2),$$

is the normal distribution with mean 0 and variance $A^2 \sigma^2$, where

$$
A^2 = \frac{\sum x_{1j}^2 - 2 x_0 \sum x_{1j} + x_0^2 n_1}{n_1 \sum (x_{1j} - \bar{x}_1)^2}
+ \frac{\sum x_{2j}^2 - 2 x_0 \sum x_{2j} + x_0^2 n_2}{n_2 \sum (x_{2j} - \bar{x}_2)^2}.
$$

(d) Show that $U = N \hat{\sigma}^2 / \sigma^2$ is distributed as $\chi^2(N)$, where $N = n_1 + n_2 - 4$.

(e) Show that $U$ and $Z$ are independent.

(f) Show that $W = Z^2 / (A^2 \hat{\sigma}^2)$ has the $F$ distribution with degrees of freedom 1 and $N$.

(g) Let $S_1^2 = \sum (x_{1j} - \bar{x}_1)^2$ and $S_2^2 = \sum (x_{2j} - \bar{x}_2)^2$, and consider the following quadratic equation in $x_0$, $q(x_0) = a x_0^2 + 2 b x_0 + c = 0$:

$$
\left[ (\hat{\beta}_1 - \hat{\beta}_2)^2 - \left( \frac{1}{S_1^2} + \frac{1}{S_2^2} \right) \hat{\sigma}^2 F_{\alpha,1,N} \right] x_0^2
+ 2 \left[ (\hat{\alpha}_1 - \hat{\alpha}_2)(\hat{\beta}_1 - \hat{\beta}_2) + \left( \frac{\bar{x}_1}{S_1^2} + \frac{\bar{x}_2}{S_2^2} \right) \hat{\sigma}^2 F_{\alpha,1,N} \right] x_0
+ \left[ (\hat{\alpha}_1 - \hat{\alpha}_2)^2 - \left( \frac{\sum x_{1j}^2}{n_1 S_1^2} + \frac{\sum x_{2j}^2}{n_2 S_2^2} \right) \hat{\sigma}^2 F_{\alpha,1,N} \right] = 0.
$$

Show that if $a > 0$ and $b^2 - ac \ge 0$, then the $1 - \alpha$ confidence interval on $x_0$ is

$$
\frac{-b - \sqrt{b^2 - ac}}{a} \le x_0 \le \frac{-b + \sqrt{b^2 - ac}}{a}.
$$

7. Observations on the yield of a chemical reaction taken at various temperatures were recorded in Table 2.11:

(a) Fit a simple linear regression and estimate $\beta_0$ and $\beta_1$ using the least squares method.

(b) Compute 95% confidence intervals on $E(y \mid x)$ at 4 levels of temperature in the data. Plot the upper and lower confidence limits around the regression line.
Table 2.11 Chemical Reaction Data

temperature (°C)   yield of chemical reaction (%)

Data Source: Raymond H. Myers, Classical and Modern Regression Analysis With Applications, p. 77.

(c) Plot a 95% confidence band on the regression line. Plot it on the same graph as in part (b) and comment on it.

8. The study "Development of LIFETEST, a Dynamic Technique to Assess Individual Capability to Lift Material" was conducted at Virginia Polytechnic Institute and State University in 1982 to determine whether certain static arm strength measures influence the dynamic lift characteristics of an individual. Twenty-five individuals were subjected to strength tests and were then asked to perform a weight-lifting test in which weight was dynamically lifted overhead. The data are in Table 2.12:

(a) Find the linear regression line using the least squares method.

(b) Define the joint hypothesis $H_0: \beta_0 = 0,\ \beta_1 = 2.2$. Test this hypothesis using a 95% joint confidence region for $\beta_0$ and $\beta_1$, and draw your conclusion.

(c) Calculate the studentized residuals for the regression model. Plot the studentized residuals against $x$ and comment on the plot.
Table 2.12 Weight-lifting Test Data

Individual   Arm Strength (x)   Dynamic Lift (y)
 1           17.3                71.4
 2           19.5                48.3
 3           19.5                88.3
 4           19.7                75.0
 5           22.9                91.7
 6           23.1               100.0
 7           26.4                73.3
 8           26.8                65.0
 9           27.6                75.0
10           28.1                88.3
11           28.1                68.3
12           28.7                96.7
13           29.0                76.7
14           29.6                78.3
15           29.9                60.0
16           29.9                71.7
17           30.3                85.0
18           31.3                85.0
19           36.0                88.3
20           39.5               100.0
21           40.4               100.0
22           44.3               100.0
23           44.6                91.7
24           50.4               100.0
25           55.9                71.7

Data Source: Raymond H. Myers, Classical and Modern Regression Analysis With Applications, p. 76.
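The computations asked for in Problem 8, parts (a) and (c), can be sketched in plain Python as follows. This is an illustration, not a worked solution from the text. It assumes the Table 2.12 entries are read with one decimal place (an assumption about the printed values), and it uses internally studentized residuals, $r_i = e_i / (s\sqrt{1 - h_{ii}})$ with $h_{ii} = 1/n + (x_i - \bar{x})^2 / S_{xx}$. The variable names are hypothetical.

```python
import math

# (arm strength x, dynamic lift y), read from Table 2.12 with one decimal
# place assumed for every entry.
data = [
    (17.3, 71.4), (19.5, 48.3), (19.5, 88.3), (19.7, 75.0), (22.9, 91.7),
    (23.1, 100.0), (26.4, 73.3), (26.8, 65.0), (27.6, 75.0), (28.1, 88.3),
    (28.1, 68.3), (28.7, 96.7), (29.0, 76.7), (29.6, 78.3), (29.9, 60.0),
    (29.9, 71.7), (30.3, 85.0), (31.3, 85.0), (36.0, 88.3), (39.5, 100.0),
    (40.4, 100.0), (44.3, 100.0), (44.6, 91.7), (50.4, 100.0), (55.9, 71.7),
]

x = [p[0] for p in data]
y = [p[1] for p in data]
n = len(data)

# Least squares estimates for y = b0 + b1 * x.
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Residuals and the residual standard error with n - 2 degrees of freedom.
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
s = math.sqrt(sum(e * e for e in resid) / (n - 2))

# Leverages h_ii and internally studentized residuals.
lev = [1.0 / n + (xi - x_bar) ** 2 / sxx for xi in x]
student = [e / (s * math.sqrt(1.0 - h)) for e, h in zip(resid, lev)]

print("b0 =", b0, " b1 =", b1)
for i, (xi, ri) in enumerate(zip(x, student), start=1):
    print(i, xi, round(ri, 3))
```

Plotting `student` against `x` with any plotting tool then gives the residual plot requested in part (c).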