Matrix Essentials review:

1) Matrix: a rectangular array of numbers.

2) Transpose: rows become columns and vice versa.

3) A matrix with a single row or column is called a row (or column) vector.

4) Addition and subtraction: element by element, and the dimensions must match.

5) Multiplication:
   a) Row times column (the two must have the same number of entries): multiply corresponding elements together, then add up these cross products. For example, the row (1 2 3) times the column (4 5 6)' is 1(4) + 2(5) + 3(6) = 32.
   b) Matrix times matrix: the number of columns in the first matrix must equal the number of rows in the second. In other words, each row of the first matrix must have the same number of entries as each column of the second; if the dimensions do not conform in this way, the product cannot be computed. The entry in row i, column j of the product matrix is row i of the first matrix times column j of the second.
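The row-times-column rule above is easy to sketch in code. The notes do their computing in SAS; this is an illustrative Python helper with made-up numbers, not part of the original examples:

```python
def mat_mult(A, B):
    """Multiply matrices A (m x n) and B (n x p), given as lists of rows.

    Entry (i, j) of the product is row i of A times column j of B:
    multiply corresponding elements, then sum the cross products.
    """
    n = len(B)   # rows of B must equal columns of A for the product to exist
    assert all(len(row) == n for row in A), "dimensions do not conform"
    return [[sum(A[i][k] * B[k][j] for k in range(n))
             for j in range(len(B[0]))]
            for i in range(len(A))]

# Row times column: (1 2 3) times (4 5 6)' is 1*4 + 2*5 + 3*6 = 32.
print(mat_mult([[1, 2, 3]], [[4], [5], [6]]))   # [[32]]

# Matrix times matrix: a 2x3 times a 3x2 gives a 2x2.
A = [[1, 2, 3],
     [4, 5, 6]]
B = [[7, 8],
     [9, 10],
     [11, 12]]
print(mat_mult(A, B))   # [[58, 64], [139, 154]]
```

Swapping the factors (`mat_mult(B, A)`) gives a 3x3 matrix instead, a first hint that matrix multiplication is not commutative.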
Exercise: For matrices A and B, does AB = BA?

More examples: AI = A and IA = A; therefore I is called an identity matrix.

6) Division. With numbers, to divide by 2 we notice that (.5)(2) is 1. The secret of dividing by 2 is to multiply by the inverse of 2, because then the equation 2X = Y becomes X = (.5)Y. We have seen that the matrix I is acting like the number 1, and both are called the identity, for numbers in one case and matrices in the other, because any number times 1 is that same number and any matrix times I is that matrix. Furthermore, AI = IA = A (for I of the appropriate dimension) whereas, in general, AB is not the same as BA. The inverse of a number is another number which, when multiplied by the original number, gives the identity, so .5 is the inverse of 2 because (.5)(2) = 1. To have an inverse, a matrix must at least be square (same number of rows as columns). The inverse of a square matrix A is another square matrix which, when multiplied by A, produces an identity matrix as the product.

Examples: Find the inverse of A. [The matrix of this example was lost in this copy.] Answer: Not possible. No inverse exists for this matrix because one of the columns is an exact linear combination of the other two. Whenever this happens, the matrix cannot be inverted. Such a matrix is said to not be of full rank and is called a singular matrix. Letting the columns be C1, C2, and C3, one of them can be written exactly as a combination of the other two.
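Both ideas above can be checked numerically: a claimed inverse is verified by multiplying, and a matrix with a dependent column has determinant 0 and no inverse. A small Python sketch with hypothetical matrices (the matrices in the notes' own examples did not survive in this copy):

```python
def mat_mult(A, B):
    """Entry (i, j) of the product is row i of A times column j of B."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

# Verifying an inverse: the product must be the identity matrix I.
# (Hypothetical 2x2 pair chosen so the arithmetic is easy to follow.)
A     = [[2, 1],
         [1, 1]]
A_inv = [[ 1, -1],
         [-1,  2]]
print(mat_mult(A, A_inv))   # [[1, 0], [0, 1]] -- the identity, so A_inv checks out

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Singular example: column 3 equals (column 1) + 2*(column 2), an exact
# linear combination, so the matrix is not of full rank.
S = [[1, 2, 5],
     [0, 1, 2],
     [3, 1, 5]]
print(det3(S))   # 0 -> singular, no inverse exists
```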
Examples: Find the inverse of D. Answer: [the matrix D and its inverse D⁻¹ were lost in this copy]. Note the notation D⁻¹ for the inverse. Because it has no dependent columns, that is, no columns that can be written as linear combinations of others, this matrix D is said to be of full rank or, equivalently, it is said to be nonsingular. I have not shown you how to compute the inverse; SAS will do that. You should, however, be able to show that a claimed inverse really is the inverse. Remember, the idea of a number's inverse is another number whose product with the original number is the identity. Because (.5)(2) = 1, we see that 2 is the inverse of .5 and .5 is the inverse of 2. Multiply to verify that the identity is the product: the (i, j) entry of DD⁻¹ is row i of D times column j of D⁻¹, and every diagonal entry should come out 1 with every off-diagonal entry 0.

LEAST SQUARES: We want to fit a line to a set of points (X, Y). [The four data points of this example were lost in this copy.] That is, we want to find b0 and b1 such that the column of residuals R, each of whose elements is Y − (b0 + b1X), has the smallest sum of squares, where the Y values form one column and the X values another. For example, one first guess at b0 and b1 gave a column of residuals R with sum of squares R'R = 9. We can probably do better. Notice that if X is the matrix whose first column is all 1s and whose second column holds the X values, Y is the column of Y values, and b = (b0, b1)', then Y − Xb = R is the column of residuals for any (b0, b1) pair we pick, so finding the b0 and b1 that minimize R'R will give an intercept and slope
that cannot be beaten (in terms of minimizing the error sum of squares). We thus compute R'R = (Y − Xb)'(Y − Xb) and set its derivatives with respect to b0 and b1 equal to 0. This results in a matrix equation whose solution is the vector b = (b0, b1)' of desired estimates

***********************
X'X b = X'Y
************************

which is the reason for all the matrix algebra we have seen thus far. Notice that if we can invert X'X then we can solve for b, and if we have a program that can invert matrices accurately, then it does not matter how many observations or explanatory variables we have: we can get the least squares estimates! The important equation

***********************
X'X b = X'Y
************************

is called the normal equations (plural because the matrix equation has several rows) and again, if we can invert the X'X matrix, the vector of least squares solutions b is given by

***********************
b = (X'X)⁻¹ X'Y
************************

Let's try to get the unbeatable (in terms of least squares) intercept and slope for our points. [The numeric computation of X'X, (X'X)⁻¹, and X'Y for the example data was lost in this copy.] Our predicting equation Ypredicted = b0 + b1X then has the smallest possible sum of squared residuals, way better than the 9 we got from our first guess. Suppose we try computing the residual sum of squares across a grid of (b0, b1) values and plotting the result:
DATA OLS;
 /* The grid limits, the MIN cap, and the data values in the original were
    lost in this copy; the numbers below are hypothetical stand-ins. */
 DO B0 = -1 TO 2 BY .1;
  DO B1 = 0 TO 3 BY .1;
   SSE = MIN( (2-B0-B1*1)**2 + (3-B0-B1*2)**2
            + (5-B0-B1*3)**2 + (6-B0-B1*4)**2, 16);
   OUTPUT;
  END;
 END;
PROC G3D;      PLOT B0*B1=SSE / ROTATE=30; RUN;
PROC GCONTOUR; PLOT B0*B1=SSE;             RUN;

Why do we want to use least squares in the first place? What is so good about that method? The answer is that if the errors are independent and normally distributed with constant variance, then the least squares estimated intercept and slope will vary around the true values in repeated samples, be normally distributed in repeated samples, and will have the smallest possible variation in repeated samples. In practice, we use a computer program to do the computations above. For example,

ods html close; ods listing;
ods listing gpath="%sysfunc(pathname(work))";
OPTIONS LS=76;   /* line-size option; the original value was partly lost */
/* The original data lines were lost in this copy; the points below are stand-ins. */
DATA OLS;
 INPUT X Y @@;
CARDS;
1 2  2 3  3 5  4 6
;
PROC REG DATA=OLS;
 MODEL Y=X / XPX COVB;
 OUTPUT OUT=OUT1 PREDICTED=P RESIDUAL=R;
RUN;
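The grid program above evaluates SSE over candidate (b0, b1) pairs and lets the plot reveal the minimum. The same brute-force idea in a short Python sketch, with hypothetical data points since the originals were lost:

```python
xs = [1, 2, 3, 4]   # hypothetical data; the original points were lost
ys = [2, 3, 5, 6]

def sse(b0, b1):
    """Residual sum of squares for the line Y = b0 + b1*X."""
    return sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))

# Walk a grid of candidate (b0, b1) pairs, keeping the best one seen so far.
best = None
for i in range(41):
    for j in range(41):
        b0, b1 = i / 10, j / 10          # 0.0 to 4.0 in steps of 0.1
        if best is None or sse(b0, b1) < best[0]:
            best = (sse(b0, b1), b0, b1)

print(best)   # the minimum lands at (b0, b1) = (0.5, 1.4) for this data
```

Because SSE is a convex quadratic in (b0, b1), the surface has a single bowl-shaped minimum, which is exactly the point the normal equations find without any searching.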
PROC SGPLOT DATA=OUT1;
 SCATTER X=X Y=Y;
 SERIES X=X Y=P;
RUN;

1) The 2 by 2 matrix X'X is in the top left, bordered by X'Y to the right. Y'Y is in the lower right corner.

The REG Procedure
Model: MODEL1

Model Crossproducts X'X X'Y Y'Y

Variable    Intercept    X    Y
[the numeric entries of this listing were lost in this copy]

2) The matrix X'X is inverted and bordered by b to the right. SS(error) is in the lower right corner.

X'X Inverse, Parameter Estimates, and SSE

Variable    Intercept    X    Y
[the numeric entries of this listing were lost in this copy]

Analysis of Variance

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model
Error
Corrected Total
[the numeric entries of this listing were lost in this copy]
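The bordered crossproducts listing above can be reproduced by hand: each data row contributes (1, X, Y), and the matrix of all pairwise crossproducts contains X'X, X'Y, and Y'Y at once. A Python sketch, again with hypothetical stand-in data since the original values were lost:

```python
xs = [1, 2, 3, 4]   # hypothetical data; the original points were lost
ys = [2, 3, 5, 6]

# Crossproducts of the columns (1, X, Y): the top-left 2x2 block is X'X,
# the last column (and row) border is X'Y, and the corner entry is Y'Y.
rows = [[1, x, y] for x, y in zip(xs, ys)]
bordered = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
            for i in range(3)]
print(bordered)     # [[4, 10, 16], [10, 30, 47], [16, 47, 74]]

# Solve the normal equations b = (X'X)^{-1} X'Y using the 2x2 inverse formula:
# the inverse of [[a, b], [c, d]] is [[d, -b], [-c, a]] divided by ad - bc.
(n, Sx, Sy), (_, Sxx, Sxy) = bordered[0], bordered[1]
det = n * Sxx - Sx * Sx
b0 = (Sxx * Sy - Sx * Sxy) / det    # intercept estimate
b1 = (n * Sxy - Sx * Sy) / det      # slope estimate
print(b0, b1)       # 0.5 1.4 for this data
```

This is exactly what PROC REG's XPX option displays, and inverting the X'X block is the step that turns the crossproducts into parameter estimates.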
3) Parameter estimates are given with standard errors. Each t statistic is the ratio of the estimate to its standard error. The slope is significantly different from 0 (its p-value is less than .05).

Parameter Estimates

Variable    DF    Parameter Estimate    Standard Error    t Value    Pr > |t|
Intercept
X
[the numeric entries of this listing were lost in this copy]

Where did those standard errors come from? First, multiply each entry in (X'X)⁻¹ by MSE, the mean squared error, which is an estimate of the variance σ² of the errors e. The resulting matrix is called the covariance matrix of the parameter estimates and is shown below as a result of the COVB option. The negative off-diagonal number is the covariance between the slope and intercept in repeated samples. The elements on the diagonal (from upper left to lower right) are estimated variances of the intercept and the slopes (as many slopes as you have predictor variables, just 1 in this example). The square roots of these numbers are the standard errors; for example, the standard error of the intercept is the square root of the first diagonal entry.

Covariance of Estimates

Variable    Intercept    X
Intercept
X
[the numeric entries of this listing were lost in this copy]
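The recipe just described, MSE times (X'X)⁻¹, then square roots of the diagonal, can be traced end to end in a few lines. A Python sketch using the same hypothetical stand-in data as the earlier sketches (the original numbers were lost):

```python
import math

xs = [1, 2, 3, 4]   # hypothetical data; the original points were lost
ys = [2, 3, 5, 6]
n = len(xs)

Sx, Sy = sum(xs), sum(ys)
Sxx = sum(x * x for x in xs)
Sxy = sum(x * y for x, y in zip(xs, ys))
det = n * Sxx - Sx * Sx
XtX_inv = [[Sxx / det, -Sx / det],
           [-Sx / det,  n / det]]

b0 = XtX_inv[0][0] * Sy + XtX_inv[0][1] * Sxy      # intercept estimate
b1 = XtX_inv[1][0] * Sy + XtX_inv[1][1] * Sxy      # slope estimate
sse = sum((y - b0 - b1 * x) ** 2 for x, y in zip(xs, ys))
mse = sse / (n - 2)                                # estimates sigma^2

cov = [[mse * v for v in row] for row in XtX_inv]  # covariance of estimates
se_b0 = math.sqrt(cov[0][0])                       # sqrt of a diagonal entry
se_b1 = math.sqrt(cov[1][1])
t_slope = b1 / se_b1                               # t statistic for the slope
print(round(se_b0, 4), round(se_b1, 4), round(t_slope, 2))   # ~ 0.3873 0.1414 9.9
```

The off-diagonal entry of `cov` is negative here, matching the text's remark that the intercept and slope estimates covary negatively across repeated samples.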