EXST7034 - Regression Techniques

SIMPLE LINEAR REGRESSION WITH MATRIX ALGEBRA

MODEL:         Yi = β0 + β1·Xi + εi

MATRIX MODEL:  Y = XB + E

or   | Y1 |   | 1  X1 |            | e1 |
     | Y2 | = | 1  X2 | · | b0 | + | e2 |
     | ⋮  |   | ⋮   ⋮ |   | b1 |   | ⋮  |
     | Yn |   | 1  Xn |            | en |

Where,
   Y is the vector of dependent variables
   XB is the linear model, with the data values and the parameter estimates
      in separate matrices; the "1"s account for the intercept
   E is the vector of random deviations, or random errors

THE VALUES NEEDED FOR THE MATRIX SOLUTION ARE:

Y'Y, which is equivalent to USSY and equal to ΣYi²

                          | Y1 |
   Y'Y = [Y1 Y2 ... Yn] · | Y2 | = ΣYi²
                          | ⋮  |
                          | Yn |

X'X, which produces various intermediate sums

                               | 1  X1 |
   X'X = | 1   1  ...  1  | ·  | 1  X2 | = |  n    ΣXi  |
         | X1  X2 ...  Xn |    | ⋮   ⋮ |   | ΣXi   ΣXi² |
                               | 1  Xn |
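These intermediate matrices are easy to verify numerically. The sketch below uses Python/numpy with a small made-up dataset (the x and y values are illustrative, not from the notes):

```python
import numpy as np

# Made-up data for illustration: n = 5 observations.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Design matrix X: a column of 1s (the intercept) beside the X column.
X = np.column_stack([np.ones_like(x), x])
Y = y.reshape(-1, 1)

YtY = (Y.T @ Y).item()   # scalar: sum of Yi^2 (USSY)
XtX = X.T @ X            # [[n, Sum Xi], [Sum Xi, Sum Xi^2]]
XtY = X.T @ Y            # [[Sum Yi], [Sum XiYi]]
```

The column of 1s is what places n and ΣXi in the first row and column of X'X.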
X'Y, which produces the sum of Y and the cross products

                               | Y1 |
   X'Y = | 1   1  ...  1  | ·  | Y2 | = |  ΣYi  |
         | X1  X2 ...  Xn |    | ⋮  |   | ΣXiYi |
                               | Yn |

Using these three intermediate matrices (Y'Y, X'X and X'Y) we can proceed
with the matrix solutions. The least squares solution for linear regression
is based on the solutions of the normal equations. The normal equations for
a Simple Linear Regression are

   n·b0   + ΣXi·b1  = ΣYi
   ΣXi·b0 + ΣXi²·b1 = ΣXiYi

which can be expressed as a matrix equation

   |  n    ΣXi  |   | b0 |   |  ΣYi  |
   | ΣXi   ΣXi² | · | b1 | = | ΣXiYi |
From our previous calculations we can recognize this matrix equation as

   (X'X) B = X'Y

where B is the vector of regression coefficients. In order to obtain the
fitted model we must solve for B.

   (X'X)⁻¹(X'X) B = (X'X)⁻¹(X'Y)
   I·B = (X'X)⁻¹(X'Y)
   B = (X'X)⁻¹(X'Y)

where, for a 2×2 matrix A,

   A⁻¹ = 1/(a11·a22 - a12·a21) · |  a22  -a12 |
                                 | -a21   a11 |

and then

   (X'X)⁻¹ = 1/(nΣXi² - (ΣXi)²) · |  ΣXi²  -ΣXi |
                                  | -ΣXi     n  |

   B = | b0 | = | (ΣXi²·ΣYi - ΣXi·ΣXiYi) / (nΣXi² - (ΣXi)²) |
       | b1 |   | (n·ΣXiYi - ΣXi·ΣYi)   / (nΣXi² - (ΣXi)²)  |
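The matrix solution and the explicit 2×2 inverse formula above give the same coefficients. A sketch with made-up data chosen so that Y = 1 + 2X exactly (these numbers are assumptions for illustration):

```python
import numpy as np

# Made-up data lying exactly on Y = 1 + 2X.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.0, 5.0, 7.0, 9.0])

X = np.column_stack([np.ones_like(x), x])
XtX = X.T @ X
XtY = X.T @ y

# Matrix route: solve (X'X) B = X'Y (solve() is preferred to forming the inverse).
B = np.linalg.solve(XtX, XtY)

# Scalar route: the explicit inverse, with determinant n*SumX^2 - (SumX)^2.
n, Sx, Sx2, Sy, Sxy = len(x), x.sum(), (x ** 2).sum(), y.sum(), (x * y).sum()
det = n * Sx2 - Sx ** 2
b0 = (Sx2 * Sy - Sx * Sxy) / det   # intercept
b1 = (n * Sxy - Sx * Sy) / det     # slope
```

Both routes recover b0 = 1 and b1 = 2 for this data.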
Once we have fitted the line using the matrix equation, we proceed to obtain
the Analysis of Variance table and our tests of hypothesis.

ANOVA TABLE in matrix form

                    uncorrected                corrected
   Source          d.f.   USS                 d.f.   SS
   Regression       2     B'X'Y                1     B'X'Y - CF
   Error           n-2    Y'Y - B'X'Y         n-2    Y'Y - B'X'Y
   Total            n     Y'Y                 n-1    Y'Y - CF

The correction factor (CF) is the same as has been previously discussed.
It can be calculated as either (ΣYi)²/n or nȲ².

Additional calculations are

   R² = (B'X'Y - CF) / (Y'Y - CF)

and the F test and t-test are given by

   F = MSRegression / MSError = [(B'X'Y - CF)/dfReg] / [(Y'Y - B'X'Y)/dfError]

The Variance-Covariance matrix can be calculated as

   MSE·(X'X)⁻¹ = MSE· | c00  c01 | = | MSE·c00  MSE·c01 |
                      | c10  c11 |   | MSE·c10  MSE·c11 |

               = | S²b0    Sb0,b1 |
                 | Sb1,b0  S²b1   |
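The ANOVA quantities above chain together as identities (in particular SSReg + SSE = SSTotal), which a few lines of numpy can confirm on made-up data:

```python
import numpy as np

# Made-up, roughly linear data (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([2.0, 4.1, 5.9, 8.2, 9.8, 12.1])
n = len(y)

X = np.column_stack([np.ones(n), x])
B = np.linalg.solve(X.T @ X, X.T @ y)

YtY = y @ y              # uncorrected total SS
BtXtY = B @ (X.T @ y)    # uncorrected regression SS, B'X'Y
CF = n * y.mean() ** 2   # correction factor, (Sum Yi)^2 / n

SSReg = BtXtY - CF
SSTot = YtY - CF
SSE = YtY - BtXtY

MSE = SSE / (n - 2)
F = (SSReg / 1) / MSE
R2 = SSReg / SSTot

varcov = MSE * np.linalg.inv(X.T @ X)   # MSE * (X'X)^-1
```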
REVIEW OF SIMPLE LINEAR REGRESSION WITH MATRIX ALGEBRA

first obtain Y'Y, X'X, X'Y, and (X'X)⁻¹, then

   B = (X'X)⁻¹(X'Y)
   SSReg = B'X'Y - CF
   SSTotal = Y'Y - CF
   SSE = SSTotal - SSReg = Y'Y - B'X'Y

   Ŷ = L·B, where L = [1  x0] is the row vector of X values at the point
      to be estimated

   S²Ŷ = L(X'X)⁻¹L'·(MSE)                                   (mean response)

   S²Ŷ = [1 + L(X'X)⁻¹L']·(MSE) = L(X'X)⁻¹L'·(MSE) + MSE    (new observation)

   Variance-Covariance Matrix = (X'X)⁻¹·(MSE)
Prediction of the true mean (population mean) and of new observations.

SLR

a) Predicting the true population mean at x0

   Model: Yi = β0 + β1·Xi + εi
   The true population mean at x0 is:  β0 + β1·x0
   The predicted value (Ŷ0) is:  b0 + b1·x0 = Ȳ + b1·(x0 - X̄)
   Variance (error) for Ŷ0 is:  σ²·[1/n + (x0 - X̄)²/SXX]
   Confidence Interval:
      (b0 + b1·x0) ± t(α/2, n-2) · √( σ̂²·[1/n + (x0 - X̄)²/SXX] )

b) Predicting a new observation at x0

   The true observation at x0 is:  Y0 = β0 + β1·x0 + ε0
   The predicted value (Ŷ0) is:  b0 + b1·x0
   Variance (error) for Ŷ0 is:  σ²·[1 + 1/n + (x0 - X̄)²/SXX]
   Confidence Interval:
      (b0 + b1·x0) ± t(α/2, n-2) · √( σ̂²·[1 + 1/n + (x0 - X̄)²/SXX] )

MLR - matrix algebra generalization

a) Predicting the true population mean

   Model: Y = Xβ + ε
   The true population mean at X0 is:  X0'β
   The predicted value is:  X0'B
   Variance (error) is:  σ²·X0'(X'X)⁻¹X0
   Confidence Interval:  X0'b ± t(α/2, n-p) · √( σ̂²·X0'(X'X)⁻¹X0 )

b) Predicting a new observation

   The true observation at X0 is:  Y0 = X0'β + ε0
   The predicted value is:  X0'b
   Variance (error) is:  σ²·[1 + X0'(X'X)⁻¹X0]
   Confidence Interval:  X0'b ± t(α/2, n-p) · √( σ̂²·[1 + X0'(X'X)⁻¹X0] )
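The SLR variance formulas above and their matrix generalization are the same quantities; the sketch below (made-up data, hypothetical prediction point x0 = 3.5) checks both that equivalence and the "one extra MSE" rule for a new observation:

```python
import numpy as np

# Made-up SLR data (illustrative only).
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([1.8, 4.2, 5.7, 8.1, 10.2])
n = len(y)

X = np.column_stack([np.ones(n), x])
B = np.linalg.solve(X.T @ X, X.T @ y)
MSE = ((y - X @ B) ** 2).sum() / (n - 2)

xbar = x.mean()
Sxx = ((x - xbar) ** 2).sum()
x0 = 3.5                       # hypothetical point of prediction

# Scalar formulas from the notes.
s2_mean = MSE * (1.0 / n + (x0 - xbar) ** 2 / Sxx)         # true mean at x0
s2_pred = MSE * (1.0 + 1.0 / n + (x0 - xbar) ** 2 / Sxx)   # new observation

# Matrix form: MSE * X0'(X'X)^-1 X0 with X0 = [1, x0].
X0 = np.array([1.0, x0])
s2_mean_matrix = MSE * (X0 @ np.linalg.inv(X.T @ X) @ X0)
```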
More ANOVA table stuff

We have seen the partitioning of the Yi into components

   SSTotal (uncorrected) = SSCF + SSRegression + SSError

ANOVA TABLE for SLR

   SOURCE                d.f.    SS
   Mean                   1      nȲ²
   Regression             1      Σ(Ŷi - Ȳ)²
   Error                 n-2     Σ(Yi - Ŷi)²
   Total (uncorrected)    n      ΣYi²

   SS(Mean) = SS(b0) = R(b0)
   SS(Regression) = SS(b1|b0) = R(b1|b0)
   SS(Error) = SSE = SSResiduals
   SSR(b0,b1) = B'X'Y = combined sum of squares for SSMean and SSReg

The usual ANOVA TABLE for SLR

   SOURCE            d.f.    SS            MS
   SSR(b1|b0)         1      Σ(Ŷi - Ȳ)²    MSReg = Σ(Ŷi - Ȳ)² / 1
   Error             n-2     Σ(Yi - Ŷi)²   MSError = Σ(Yi - Ŷi)² / (n-2)
   Total Corrected   n-1     Σ(Yi - Ȳ)²    S²Y = Σ(Yi - Ȳ)² / (n-1)
Multiple Regression

   Yi = β0 + β1·X1i + β2·X2i + εi,  where E(εi) = 0

then

   E(Yi) = β0 + β1·X1i + β2·X2i

This model then predicts any point on a PLANE, where the axes of the plane
are X1 and X2. The response variable, Yi, gives the height above the plane.

If we choose any particular value of X1 or X2, then we essentially take a
slice of the plane.

   Ŷi = b0 + b1·X1i + b2·X2i

hold X2 constant at C_X2:

   Ŷi = b0 + b1·X1i + b2·C_X2
   Ŷi = (b0 + b2·C_X2) + b1·X1i

Let's suppose we take that slice at a point where X1 = 10 for the following
model. We are then holding X1 constant at 10, and examining the line (a
slice of the plane) at that point.

   Ŷi = 5 + 2·X1i + 3·X2i
   Ŷi = 5 + 2·10 + 3·X2i
   Ŷi = 25 + 3·X2i

This is a simple linear function. At every particular value of either X1
or X2, the function for the other X will be simple linear for this model.

NOTE that the interpretation of the regression coefficients is the same as
before, except that now we have one regression coefficient per independent
variable.

General Linear Regressions

   Yi = β0 + β1·X1i + β2·X2i + β3·X3i + ... + β(p-1)·X(p-1),i + εi

This is no longer a simple plane; it describes a hyperplane. However, we
could still hold all Xk constant except one, and describe a simple linear
function.
Calculations for Multiple Regression

   Yi = b0 + b1·X1i + b2·X2i + ... + bk·Xki + ei

where there are k different independent variables

1) as before, the equation can be solved for ei, and partial derivatives
   taken with respect to each unknown (b0, b1, b2, ...)

2) the partial derivatives can be set equal to zero, and solved
   simultaneously to get k+1 equations with k+1 unknowns

3) the normal equations derived are

   n·b0   + ΣX1·b1   + ΣX2·b2   + ... + ΣXk·bk   = ΣYi
   ΣX1·b0 + ΣX1²·b1  + ΣX1X2·b2 + ... + ΣX1Xk·bk = ΣX1iYi
   ΣX2·b0 + ΣX1X2·b1 + ΣX2²·b2  + ... + ΣX2Xk·bk = ΣX2iYi
     ⋮         ⋮          ⋮                ⋮          ⋮
   ΣXk·b0 + ΣXkX1·b1 + ΣXkX2·b2 + ... + ΣXk²·bk  = ΣXkiYi

which can be factored out to an equation of matrices

   |  n    ΣX1    ΣX2   ...  ΣXk   |   | b0 |   |  ΣYi   |
   | ΣX1   ΣX1²   ΣX1X2 ...  ΣX1Xk |   | b1 |   | ΣX1iYi |
   | ΣX2   ΣX1X2  ΣX2²  ...  ΣX2Xk | · | b2 | = | ΣX2iYi |
   |  ⋮     ⋮      ⋮           ⋮   |   | ⋮  |   |   ⋮    |
   | ΣXk   ΣXkX1  ΣXkX2 ...  ΣXk²  |   | bk |   | ΣXkiYi |
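Setting up and solving the k+1 normal equations is exactly a call to np.linalg.solve on X'X and X'Y. A sketch with three made-up predictors (the coefficients and seed are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 30, 3
Xvars = rng.normal(size=(n, k))                   # made-up predictors
y = (2.0 + Xvars @ np.array([1.0, -0.5, 0.3])
     + rng.normal(scale=0.1, size=n))             # made-up response

X = np.column_stack([np.ones(n), Xvars])   # n x (k+1) design matrix
XtX = X.T @ X                              # the (k+1) x (k+1) matrix of sums
XtY = X.T @ y                              # right-hand sides of the normal equations

b = np.linalg.solve(XtX, XtY)              # k+1 equations in k+1 unknowns
resid = y - X @ b
```

The defining property of the least-squares solution is that the residuals are orthogonal to every column of X, i.e. X'(Y - Xb) = 0, which is just the normal equations rearranged.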
Matrix calculations for General Regression: Numerical Example - NWK7.0
Mathematician salaries. X1 = index of publication quality, X2 = years of
experience, X3 = success in getting grant support.

        | 1  33.2  3.5   9.0 |        | 6.1 |
        | 1  40.3  5.3   0.0 |        | 6.4 |
        | 1  38.7  5.1  18.0 |        | 7.4 |
        | 1  46.8  5.8  33.0 |        | 6.7 |
        | 1  41.4  4.2  31.0 |        | 7.5 |
        | 1  37.5  6.0  13.0 |        | 5.9 |
        | 1  39.0  6.8   5.0 |        | 6.0 |
        | 1  40.7  5.5  30.0 |        | 4.0 |
        | 1  30.1  3.1   5.0 |        | 5.8 |
        | 1  52.9  7.2  47.0 |        | 8.3 |
        | 1  38.2  4.5   5.0 |        | 5.0 |
   X =  | 1  31.8  4.9  11.0 |   Y =  | 6.4 |
        | 1  43.3  8.0   3.0 |        | 7.4 |
        | 1  44.1  6.5  35.0 |        | 7.0 |
        | 1  42.8  6.6  39.0 |        | 5.0 |
        | 1  33.6  3.7   1.0 |        | 4.4 |
        | 1  34.2  6.2   7.0 |        | 5.5 |
        | 1  48.0  7.0  40.0 |        | 7.0 |
        | 1  38.0  4.0  35.0 |        | 6.0 |
        | 1  35.9  4.5   3.0 |        | 3.5 |
        | 1  40.4  5.9  33.0 |        | 4.9 |
        | 1  36.8  5.6   7.0 |        | 4.3 |
        | 1  45.2  4.8  34.0 |        | 8.0 |
        | 1  35.1  3.9  15.1 |        | 5.0 |
Raw data matrices (X and Y) and the intermediate calculations
(X'X, X'Y & Y'Y).

         |  n     ΣX1i      ΣX2i     ΣX3i    |
   X'X = | ΣX1i   ΣX1i²     ΣX1iX2i  ΣX1iX3i |
         | ΣX2i   ΣX1iX2i   ΣX2i²    ΣX2iX3i |
         | ΣX3i   ΣX1iX3i   ΣX2iX3i  ΣX3i²   |

       = |   24       948      128.6      599   |
         |   948    38135.6   5188.17   4873.7  |
         | 128.6    5188.17    727.44   3365.3  |
         |   599    4873.7    3365.3   17847    |

         |  ΣYi    |   |  143.5  |
   X'Y = | ΣX1iYi  | = | 5767.77 |
         | ΣX2iYi  |   |  78.49  |
         | ΣX3iYi  |   | 3671.9  |

   Y'Y = ΣYi² = 899.49
The normal equations derived are

   n·b0   + ΣX1·b1   + ΣX2·b2   + ΣX3·b3   = ΣYi
   ΣX1·b0 + ΣX1²·b1  + ΣX1X2·b2 + ΣX1X3·b3 = ΣX1iYi
   ΣX2·b0 + ΣX1X2·b1 + ΣX2²·b2  + ΣX2X3·b3 = ΣX2iYi
   ΣX3·b0 + ΣX1X3·b1 + ΣX2X3·b2 + ΣX3²·b3  = ΣX3iYi

which can be factored out to an equation of matrices

   |  n    ΣX1    ΣX2    ΣX3   |   | b0 |   |  ΣYi   |
   | ΣX1   ΣX1²   ΣX1X2  ΣX1X3 |   | b1 |   | ΣX1iYi |
   | ΣX2   ΣX1X2  ΣX2²   ΣX2X3 | · | b2 | = | ΣX2iYi |
   | ΣX3   ΣX1X3  ΣX2X3  ΣX3²  |   | b3 |   | ΣX3iYi |
Analysis starts with the X'X inverse

             | c00  c01  c02  c03 |   |  5.3478   -0.1958    0.1486    0.0654  |
   (X'X)⁻¹ = | c10  c11  c12  c13 | = | -0.1958    0.0084   -0.0115   -0.00874 |
             | c20  c21  c22  c23 |   |  0.1486   -0.0115    0.05088   0.00356 |
             | c30  c31  c32  c33 |   |  0.06541  -0.00874   0.00356   0.0014  |

                        | b0 |   | -4.511385879  |
   B = (X'X)⁻¹(X'Y) =   | b1 | = |  0.3743477585 |
                        | b2 |   | -0.76494134   |
                        | b3 |   | -0.11439513   |

Analysis of Variance

   USSTotal = Y'Y
   USSRegression = B'X'Y
   SSE = Y'Y - B'X'Y = UCSSTotal - UCSSReg

The ANOVA table calculated with matrix formulas is

   Source       d.f.        SS
   Regression   p-1 = 3     B'X'Y - CF    = 21.241
   Error        n-p = 20    Y'Y - B'X'Y   = 17.845
   Total        n-1 = 23    Y'Y - CF      = 39.086

where the correction factor is calculated as usual, CF = (ΣYi)²/n = nȲ².

The F test of the model is a joint test of the regression coefficients,

   H0: β1 = β2 = β3 = ... = β(p-1) = 0

   F = MSRegression / MSError = [(B'X'Y - CF)/dfReg] / [(Y'Y - B'X'Y)/dfError]

where, for the two-variable case,

   E(MSR) = σ² + [β1²·Σ(X1i - X̄1)² + β2²·Σ(X2i - X̄2)²
                  + 2·β1·β2·Σ(X1i - X̄1)(X2i - X̄2)] / 2

NOTE: E(MSR) departs from σ² as any βk increases in magnitude (+ or -) or
as any Xki increases in distance from X̄k. The F test is a joint test for
all βk jointly equal to 0.
To test any βk individually, we can still use

   t = (bk - 0) / s_bk

where s_bk is obtained from the VARIANCE-COVARIANCE matrix (below). The
confidence interval for any βk is given by

   P( bk - t(α/2, n-p)·s_bk  ≤  βk  ≤  bk + t(α/2, n-p)·s_bk ) = 1 - α

and the Bonferroni joint confidence interval for several βk parameters is
given by

   P( bk - t(α/2g, n-p)·s_bk  ≤  βk  ≤  bk + t(α/2g, n-p)·s_bk ) = 1 - α

where g is the number of parameters.

The VARIANCE-COVARIANCE matrix is calculated from the X'X inverse:

                       | c00  c01  c02  c03 |
   MSE·(X'X)⁻¹ = MSE · | c10  c11  c12  c13 |
                       | c20  c21  c22  c23 |
                       | c30  c31  c32  c33 |

     | MSE·c00  MSE·c01  MSE·c02  MSE·c03 |   | s²b0    sb0,b1  sb0,b2  sb0,b3 |
   = | MSE·c10  MSE·c11  MSE·c12  MSE·c13 | = | sb1,b0  s²b1    sb1,b2  sb1,b3 |
     | MSE·c20  MSE·c21  MSE·c22  MSE·c23 |   | sb2,b0  sb2,b1  s²b2    sb2,b3 |
     | MSE·c30  MSE·c31  MSE·c32  MSE·c33 |   | sb3,b0  sb3,b1  sb3,b2  s²b3   |

     | Var(b1)     Cov(b1,b2)  ...  Cov(b1,bn) |
   = | Cov(b2,b1)  Var(b2)     ...  Cov(b2,bn) | = Var-Cov matrix
     |    ⋮            ⋮                ⋮      |
     | Cov(bn,b1)  Cov(bn,b2)  ...  Var(bn)    |

where the cij values are called Gaussian multipliers. The VARIANCE-
COVARIANCE matrix is calculated from this matrix by multiplying by the
MSError. These are unbiased estimates of the variances and covariances of
the regression coefficients. The individual values then provide the
variances and covariances such that

   MSE·c00 = Variance of b0 = VAR(b0)
   MSE·c11 = Variance of b1 = VAR(b1), so s_b1 = √(MSE·c11)
   MSE·c01 = MSE·c10 = Covariance of b0 and b1 = COV(b0, b1)
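The variance-covariance calculation above is a one-liner once (X'X)⁻¹ is available; the s_bk values and t statistics fall out of its diagonal. A sketch on made-up data (two predictors; all numbers are assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 25
X1 = rng.normal(5.0, 2.0, n)               # made-up predictors
X2 = rng.normal(10.0, 3.0, n)
y = 1.0 + 0.8 * X1 - 0.4 * X2 + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), X1, X2])
p = X.shape[1]                             # number of parameters
b = np.linalg.solve(X.T @ X, X.T @ y)
MSE = ((y - X @ b) ** 2).sum() / (n - p)

C = np.linalg.inv(X.T @ X)                 # the c_ij (Gaussian multipliers)
varcov = MSE * C                           # variance-covariance matrix of the b's
s_b = np.sqrt(np.diag(varcov))             # s_bk = sqrt(MSE * c_kk)
t_stats = b / s_b                          # t = (b_k - 0) / s_bk, df = n - p
```

Each t statistic would be compared against t(α/2, n-p) for the individual tests described above.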
Prediction of mean response

For simple linear regression we got Ŷ and its CI for some X. For multiple
regression, we need an X value for each Xj.

   Given a vector  X_h = [1, X1, X2, ..., X(p-1)]'

   E(Y_h) = X_h'β        Ŷ_h = X_h'B

The variance estimates for mean responses are given by

   MSE·X_h'(X'X)⁻¹X_h

and for individual observations, add one MSE:

   MSE + MSE·X_h'(X'X)⁻¹X_h

   P( Ŷ_h - t(α/2, n-p)·s_Ŷ  ≤  E(Y_h)  ≤  Ŷ_h + t(α/2, n-p)·s_Ŷ ) = 1 - α

Simultaneous estimates of several mean responses can employ either the
Working-Hotelling approach,

   Ŷ_h ± W·s_Ŷ,  where W² = p·F(1-α; p, n-p)

or the Bonferroni approach,

   Ŷ_h ± B·s_Ŷ,  where B = t(1-α/2g; n-p)
For individual observations the prediction is the same, Ŷ_h = X_h'B, and
the variance is one MSE larger than for the mean:

   MSE + MSE·X_h'(X'X)⁻¹X_h

and for the mean of a new sample of size m, the variance is

   MSE/m + MSE·X_h'(X'X)⁻¹X_h

As with the SLR, confidence intervals for g new observations can be done
with Scheffe limits,

   Ŷ_h ± S·s_Ŷ,  where S² = g·F(1-α; g, n-p)

or the Bonferroni approach,

   Ŷ_h ± B·s_Ŷ,  where B = t(1-α/2g; n-p)
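The three variances above differ only in how much extra MSE is added, so for any prediction point they must be ordered: mean response < mean of m new observations < one new observation. A sketch (made-up data and a hypothetical prediction point):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
Xvars = rng.normal(size=(n, 2))                       # made-up predictors
y = 3.0 + Xvars @ np.array([1.5, -2.0]) + rng.normal(scale=0.3, size=n)

X = np.column_stack([np.ones(n), Xvars])
b = np.linalg.solve(X.T @ X, X.T @ y)
MSE = ((y - X @ b) ** 2).sum() / (n - X.shape[1])

Xh = np.array([1.0, 0.5, -0.5])                # hypothetical prediction point
quad = Xh @ np.linalg.inv(X.T @ X) @ Xh        # Xh'(X'X)^-1 Xh

m = 4
s2_mean = MSE * quad               # estimated mean response
s2_mean_m = MSE / m + MSE * quad   # mean of a new sample of size m
s2_new = MSE + MSE * quad          # a single new observation
```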
Coefficient of Multiple Determination - the proportion of the SSTotal
(usually corrected) accounted for by the Regression line (SSReg).

Models with an intercept:

   R² = SSR(b1,b2,...,bk|b0) / SSTotal(corrected) = (B'X'Y - CF) / (Y'Y - CF)

      = (SSTotal - SSE) / SSTotal = 1 - SSE/SSTotal

      = 21.24 / 39.09 = 0.5434

Recalling the expressions for the SS, we have

   R² = (B'X'Y - nȲ²) / (Y'Y - nȲ²) = Σ(Ŷi - Ȳ)² / Σ(Yi - Ȳ)²

Some Properties of R²: Same as for SLR

1) 0 ≤ R² ≤ 1
2) R² = 1.0 iff Ŷi = Yi for all i (perfect prediction, SSE = 0)
3) R² = r²XY for simple linear regression
4) R² = r²YŶ for all models with intercepts
5) R² ≠ 1.0 when there are different repeated values of Yi at some value
   of Xi (no matter how well the model fits)
6) R²(SubModel) ≤ R²(FullModel). New independent variables added to a
   model will increase R². The R² for the full model could be EQUAL to,
   but never less than, the R² for the submodel.

F test for Lack of Fit

   E(Y) = β0 + β1·X1i + β2·X2i + β3·X3i + ... + β(p-1)·X(p-1),i

To get true repeats in multiple regression, EVERY independent variable
must remain the same from one observation to another.

This can be calculated with either full and reduced models.
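Property 6 can be demonstrated directly: add a pure-noise predictor to a model and R² cannot go down. A sketch (all data made up):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 40
X1 = rng.normal(size=n)                    # informative made-up predictor
X2 = rng.normal(size=n)                    # pure-noise predictor
y = 2.0 + 1.0 * X1 + rng.normal(scale=0.5, size=n)

def r_squared(Xmat, yvec):
    """R^2 = 1 - SSE / SSTotal(corrected), for a model with an intercept."""
    b = np.linalg.lstsq(Xmat, yvec, rcond=None)[0]
    SSE = ((yvec - Xmat @ b) ** 2).sum()
    SSTot = ((yvec - yvec.mean()) ** 2).sum()
    return 1.0 - SSE / SSTot

sub = np.column_stack([np.ones(n), X1])          # submodel
full = np.column_stack([np.ones(n), X1, X2])     # full model: submodel + X2

R2_sub = r_squared(sub, y)
R2_full = r_squared(full, y)
```

The full model's R² can equal the submodel's, but never fall below it, because least squares can always set the new coefficient to zero.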
New problems associated with MULTIPLE REGRESSION

The SLR fitted only a single slope, so we needed only one SSReg to
describe it. With various slopes, we will need some other sums of squares
to describe the various fitted slopes; we will actually see 2 types of SS.

Non-problems associated with Multiple Regression

a) All previous definitions and notation apply.
b) The assumptions are basically the same (more Xi, each measured without
   error).

Many of the tests of hypothesis will be discussed in terms of the General
Linear Test, with appropriate Full and Reduced models. This test does not
really change with multiple regression. We still have the same table,

   Model             d.f.     SS         MS         F
   Reduced (Error)   n-p      SSE_Red
   Full (Error)      n-p-q    SSE_Full
   Difference        q        SS_Diff    MS_Diff
   Full (Error)      n-p-q    SSE_Full   MSE_Full   MS_Diff / MSE_Full

   Model             d.f.     SS         MS         F
   Full (SSReg)      p+q      SSR_Full
   Reduced (SSReg)   p        SSR_Red
   Difference        q        SS_Diff    MS_Diff
   Full (Error)      n-p-q    SSE_Full   MSE_Full   MS_Diff / MSE_Full
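The General Linear Test tables above reduce to one computation: fit the full and reduced models, difference the error sums of squares, and form F = MS_Diff / MSE_Full. A sketch (made-up data in which the extra variable truly contributes nothing):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)                        # truly irrelevant here
y = 1.0 + 2.0 * X1 + rng.normal(size=n)        # X2 has a zero coefficient

def sse(Xmat, yvec):
    """Error sum of squares for the least-squares fit of yvec on Xmat."""
    b = np.linalg.lstsq(Xmat, yvec, rcond=None)[0]
    return ((yvec - Xmat @ b) ** 2).sum()

reduced = np.column_stack([np.ones(n), X1])        # p = 2 parameters
full = np.column_stack([np.ones(n), X1, X2])       # q = 1 extra parameter

SSE_red = sse(reduced, y)
SSE_full = sse(full, y)
q = 1
df_full = n - full.shape[1]                        # n - p - q

F = ((SSE_red - SSE_full) / q) / (SSE_full / df_full)
```

With X2 irrelevant, F should usually be small (near its null expectation of about 1), and it would be compared against F(1-α; q, n-p-q).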