Lecture 9: SLR in Matrix Form
STAT 51 Spring 011
Background Reading: KNNL, Chapter 5
Topic Overview
Matrix equations for SLR. Don't focus so much on the matrix arithmetic as on the form of the equations. Try to understand how the different pieces fit together.
SLR Model in Scalar Form
Y_i = β_0 + β_1 X_i + ε_i, where ε_i ~ iid N(0, σ²) and i = 1, ..., n.
Consider now writing an equation for each observation:
  Y_1 = β_0 + β_1 X_1 + ε_1
  Y_2 = β_0 + β_1 X_2 + ε_2
   ⋮
  Y_n = β_0 + β_1 X_n + ε_n
SLR Model in Matrix Form
Stacking the n equations gives

  [Y_1]   [1  X_1]           [ε_1]
  [Y_2] = [1  X_2] [β_0]  +  [ε_2]
  [ ⋮ ]   [⋮    ⋮] [β_1]     [ ⋮ ]
  [Y_n]   [1  X_n]           [ε_n]

or, compactly, Y = Xβ + ε, where Y is n×1, X is n×2, β is 2×1, and ε is n×1.
Terminology
Y is the response vector; X is called the design matrix; β is the vector of parameters; ε is the error vector.
Covariance Matrix for ε

  Cov(ε) = σ² I_{n×n} = [σ²  0  ⋯  0]
                        [0  σ²  ⋯  0]
                        [⋮       ⋱  ⋮]
                        [0   0  ⋯ σ²]
Covariance Matrix for Y

  Cov(Y) = Cov(Y_1, ..., Y_n) = σ² I_{n×n}
Assumptions in Matrix Form
ε ~ N(0, σ²I), where 0 is the n×1 zero vector and I is the n×n identity matrix. The ones in the diagonal elements of I specify that the variance of each ε_i is σ². The zeros in the off-diagonal elements specify that the covariance between ε_i and ε_j is zero for i ≠ j, which implies zero correlation.
Least Squares Estimation
The errors are ε = Y − Xβ. We want to minimize the sum of squared errors Σ_{i=1}^{n} ε_i² = ε'ε. That is, we want to minimize
  ε'ε = (Y − Xβ)'(Y − Xβ).
Least Squares (2)
We take the derivative with respect to the vector β. This is like a quadratic function: think (Y − Xβ)². Using the chain rule, this derivative works out to
  d/dβ (Y − Xβ)'(Y − Xβ) = −2X'(Y − Xβ).
Least Squares (3)
We set this equal to 0 (a vector of zeros) and solve for β, resulting in the normal equations:
  X'Y = X'Xβ.
Solving this equation for β gives the least squares solution
  b = [b_0, b_1]' = (X'X)⁻¹X'Y.
Least Squares Solution
  b = (X'X)⁻¹X'Y   (REMEMBER THIS!!!)
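The lecture relies on statistical software for these computations. Purely as an illustration, here is a minimal numpy sketch of the matrix formula; the data vectors x and Y below are made up, not from the course:

```python
import numpy as np

# Made-up illustrative data (not from the lecture)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Design matrix: a column of ones, then the predictor
X = np.column_stack([np.ones_like(x), x])

# Least squares solution b = (X'X)^{-1} X'Y
b = np.linalg.inv(X.T @ X) @ X.T @ Y
print(b)  # [b0, b1]
```

In practice one would use np.linalg.solve(X.T @ X, X.T @ Y) or np.linalg.lstsq rather than forming the inverse explicitly; the explicit inverse is shown only to mirror the formula on the slide.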
Reality Break
This is just to convince you that we have done nothing new or magical; all we are doing is writing the same old formulas for b_0 and b_1 in matrix format. Do NOT worry if you cannot reproduce the following matrix algebra, but you SHOULD try to follow it so that you believe me that this is really not a new formula.
Recall from previous topics we had:
  b_1 = Σ(X_i − X̄)(Y_i − Ȳ) / Σ(X_i − X̄)² = SS_XY / SS_X
  b_0 = Ȳ − b_1 X̄
For the matrix format, we begin by noting:

  X'X = [ 1    1  ⋯  1 ] [1  X_1]   [  n     ΣX_i ]
        [X_1  X_2 ⋯ X_n] [⋮    ⋮] = [ ΣX_i  ΣX_i² ]
                         [1  X_n]
Since det(X'X) = nΣX_i² − (ΣX_i)² = n·SS_X, the inverse is

  (X'X)⁻¹ = (1 / (n·SS_X)) [ ΣX_i²  −ΣX_i ]
                           [ −ΣX_i    n   ]

and

  X'Y = [  ΣY_i   ]
        [ ΣX_iY_i ]

Plugging these into the equation for b yields:
  b = (X'X)⁻¹X'Y
    = (1 / (n·SS_X)) [ ΣX_i²·ΣY_i − ΣX_i·ΣX_iY_i ]
                     [ n·ΣX_iY_i − ΣX_i·ΣY_i     ]

Using ΣX_iY_i = SS_XY + nX̄Ȳ and ΣX_i² = SS_X + nX̄²:

    = (1 / (n·SS_X)) [ nȲ·SS_X − nX̄·SS_XY ]
                     [ n·SS_XY             ]

    = [ Ȳ − X̄·(SS_XY / SS_X) ] = [ b_0 ]
      [ SS_XY / SS_X          ]   [ b_1 ]
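The point of the derivation is that the scalar and matrix formulas give identical estimates. A quick numerical check of that claim, with made-up data (a sketch, not part of the lecture):

```python
import numpy as np

# Made-up data for the check
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Scalar formulas: b1 = SS_XY / SS_X, b0 = Ybar - b1 * Xbar
SSX = np.sum((x - x.mean()) ** 2)
SSXY = np.sum((x - x.mean()) * (Y - Y.mean()))
b1 = SSXY / SSX
b0 = Y.mean() - b1 * x.mean()

# Matrix formula: b = (X'X)^{-1} X'Y
X = np.column_stack([np.ones_like(x), x])
b = np.linalg.inv(X.T @ X) @ X.T @ Y

print(np.allclose(b, [b0, b1]))  # True: same estimates either way
```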
SLR in Matrices
All we have done is to write the same old formulas for b_0 and b_1 in a fancy new format. Why have we bothered to do this? The cool part is that the same approach works for multiple regression: all we do is make X and b into bigger matrices, and use exactly the same formula.
Fitted Values
The fitted (or predicted) values are:

  Ŷ = [Ŷ_1]   [b_0 + b_1X_1]   [1  X_1]
      [Ŷ_2] = [b_0 + b_1X_2] = [1  X_2] [b_0] = Xb
      [ ⋮ ]   [      ⋮     ]   [⋮    ⋮] [b_1]
      [Ŷ_n]   [b_0 + b_1X_n]   [1  X_n]
Hat Matrix
  Ŷ = Xb = X(X'X)⁻¹X'Y = HY, where H = X(X'X)⁻¹X'.
The hat matrix is involved in standard errors and plays a large role in diagnostics and the identification of influential observations.
Hat Matrix Properties
The hat matrix is symmetric (H' = H) and idempotent (HH = H).
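These two properties are easy to verify numerically. A small numpy sketch with made-up data (an illustration, not the lecture's software):

```python
import numpy as np

# Made-up data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
X = np.column_stack([np.ones_like(x), x])

# Hat matrix H = X (X'X)^{-1} X'
H = X @ np.linalg.inv(X.T @ X) @ X.T

print(np.allclose(H, H.T))    # True: symmetric
print(np.allclose(H @ H, H))  # True: idempotent

# Yhat = HY matches Xb
b = np.linalg.inv(X.T @ X) @ X.T @ Y
print(np.allclose(H @ Y, X @ b))  # True
```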
Residuals
Formula: e = Y − Xb = Y − HY = (I − H)Y.
Only Y is random, with Var(Y) = σ²I. The variance-covariance matrix for e is the n×n matrix
  Var(e) = (I − H)·σ²I·(I − H)' = σ²(I − H),
using the symmetry and idempotency of I − H. The estimated covariance matrix for e is (also an n×n matrix):
  s²(e) = MSE·(I − H).
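To make s²(e) = MSE·(I − H) concrete, here is a numpy sketch with made-up data; note the diagonal entries (the estimated residual variances) are not all equal, even though the errors ε_i all have variance σ²:

```python
import numpy as np

# Made-up data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)
X = np.column_stack([np.ones_like(x), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T

# Residuals e = (I - H)Y
e = (np.eye(n) - H) @ Y

# For SLR, MSE = SSE / (n - 2); then s^2(e) = MSE * (I - H)
MSE = (e @ e) / (n - 2)
s2_e = MSE * (np.eye(n) - H)
print(np.diag(s2_e))  # estimated residual variances, not all equal
```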
Other Variances / Covariances
The vector b is a linear combination of the elements of Y. Since Y is normal, b must also be normal. Even if normality is violated, these estimates are robust and will be approximately normal in general. To determine the variance-covariance matrix of b, we need to use the following theorem:
Useful Multivariate Theorem
Suppose U ~ N(μ, Σ), a multivariate normal vector, and V = c + DU, a linear transformation of U, where c is a vector and D is a matrix. Then V ~ N(c + Dμ, DΣD').
Covariance Matrix for b
Recall b = (X'X)⁻¹X'Y and Y ~ N(Xβ, σ²I). Now apply the theorem to b using
  U = Y,  μ = Xβ,  Σ = σ²I,  V = b,  c = 0,  and  D = (X'X)⁻¹X'.
Covariance Matrix for b (2)
The theorem tells us the vector b is normally distributed with mean
  (X'X)⁻¹X'Xβ = β
and covariance matrix
  σ²·(X'X)⁻¹X'·I·X(X'X)⁻¹ = σ²(X'X)⁻¹.
Note: Use basic results from Section 5.7 and the fact that both X'X and its inverse are symmetric to show this equality.
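In practice σ² is unknown and is replaced by MSE, giving the estimated covariance matrix s²(b) = MSE·(X'X)⁻¹, whose diagonal holds the squared standard errors of b_0 and b_1. A numpy sketch with made-up data:

```python
import numpy as np

# Made-up data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)
X = np.column_stack([np.ones_like(x), x])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
e = Y - X @ b
MSE = (e @ e) / (n - 2)

# Estimated covariance matrix s^2(b) = MSE * (X'X)^{-1}
s2_b = MSE * XtX_inv
se_b0, se_b1 = np.sqrt(np.diag(s2_b))
print(se_b0, se_b1)  # standard errors of b0 and b1
```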
Other Variances in Matrix Form
  s²(Ŷ_h) = MSE·X_h(X'X)⁻¹X_h'
  s²(pred) = MSE·(1 + X_h(X'X)⁻¹X_h')   (for Y_h(new))
Note: X_h is the row vector [1  X_h].
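These two formulas differ only by the extra "1" that accounts for the new observation's own error. A numpy sketch with made-up data and an arbitrary prediction point X_h = 3:

```python
import numpy as np

# Made-up data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)
X = np.column_stack([np.ones_like(x), x])

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ Y
MSE = ((Y - X @ b) @ (Y - X @ b)) / (n - 2)

Xh = np.array([1.0, 3.0])  # row vector [1, X_h] at the arbitrary point X_h = 3

# s^2(Yhat_h) = MSE * Xh (X'X)^{-1} Xh'
s2_fit = MSE * (Xh @ XtX_inv @ Xh)
# s^2(pred) = MSE * (1 + Xh (X'X)^{-1} Xh')
s2_pred = MSE * (1 + Xh @ XtX_inv @ Xh)
print(s2_fit, s2_pred)
```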
Sums of Squares
Sums of squares can also be written in terms of matrices. We need a special matrix J, which is square and has every entry equal to 1. Results are shown in Section 5.1.
Sum of Squares Formulas
  SSR = b'X'Y − (1/n)Y'JY
  SSE = e'e = Y'Y − b'X'Y
  SSTO = Y'Y − (1/n)Y'JY
Sums of Squares (2)
These can also be written as quadratic forms:
  SSR = Y'(H − (1/n)J)Y
  SSE = Y'(I − H)Y
  SSTO = Y'(I − (1/n)J)Y
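The quadratic forms above can be checked directly, including the decomposition SSR + SSE = SSTO. A numpy sketch with made-up data:

```python
import numpy as np

# Made-up data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
Y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)
X = np.column_stack([np.ones_like(x), x])

H = X @ np.linalg.inv(X.T @ X) @ X.T
I = np.eye(n)
J = np.ones((n, n))  # n x n matrix of all ones

# Sums of squares as quadratic forms in Y
SSR = Y @ (H - J / n) @ Y
SSE = Y @ (I - H) @ Y
SSTO = Y @ (I - J / n) @ Y
print(SSR, SSE, SSTO)  # SSR + SSE equals SSTO
```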
Computations
As usual, we will continue to use software to do computations, so you need not worry about having to multiply big matrices. Thinking in matrix form will come in handy when we begin talking about multiple regression, because all one needs to do is extend the dimensions of the matrices.
Upcoming in Lecture 10...
Introduction to Multiple Regression
Background Reading: KNNL 6.1-6.3