Biostat 140655, 2008: Matrix Algebra Review

1. Definition: An $m \times n$ matrix, $A_{m \times n}$, is a rectangular array of real numbers with $m$ rows and $n$ columns. The element in the $i$th row and the $j$th column is denoted by $a_{ij}$:
$$A_{m \times n} = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & & \ddots & \vdots \\ a_{m1} & \cdots & \cdots & a_{mn} \end{pmatrix}_{m \times n}$$

Definition: A vector $a$ of length $n$ is an $n \times 1$ matrix,
$$a = \begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_n \end{pmatrix},$$
with each element denoted by $a_i$. The $i$th element is called the $i$th component of the vector, and $n$ is the dimensionality.

Matrix Operations

1. Two matrices $A$ and $B$ of the same dimensions can be added. The sum $A + B$ has $(i, j)$ entry $a_{ij} + b_{ij}$. So
$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{pmatrix}$$

Example:
$$A = \begin{pmatrix} 1 & 3 \\ 2 & 6 \\ -1 & 3 \end{pmatrix}, \quad B = \begin{pmatrix} 3 & 2 \\ 9 & 5 \\ 3 & 0 \end{pmatrix}, \quad A + B = \begin{pmatrix} 4 & 5 \\ 11 & 11 \\ 2 & 3 \end{pmatrix}$$

   - $A + B = B + A$
   - $(A + B) + C = A + (B + C)$

2. A matrix may also be multiplied by a constant $c$. The product $cA$ is the matrix that results from multiplying each element of $A$ by $c$. Thus
$$cA_{m \times n} = \begin{pmatrix} ca_{11} & ca_{12} & \cdots & ca_{1n} \\ ca_{21} & ca_{22} & \cdots & ca_{2n} \\ \vdots & & \ddots & \vdots \\ ca_{m1} & \cdots & \cdots & ca_{mn} \end{pmatrix}_{m \times n}$$
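The addition and scalar-multiplication rules above are entrywise, which makes them easy to check numerically. A minimal sketch using NumPy (not part of the original handout; the $3 \times 2$ matrices are the example from the text):

```python
import numpy as np

# Example matrices from the text; addition requires equal dimensions
A = np.array([[1, 3],
              [2, 6],
              [-1, 3]])
B = np.array([[3, 2],
              [9, 5],
              [3, 0]])

# The sum A + B has (i, j) entry a_ij + b_ij
S = A + B
assert np.array_equal(S, np.array([[4, 5], [11, 11], [2, 3]]))

# Addition is commutative and associative
C = np.ones((3, 2), dtype=int)
assert np.array_equal(A + B, B + A)
assert np.array_equal((A + B) + C, A + (B + C))

# Scalar multiplication scales every element
assert np.array_equal(2 * A, np.array([[2, 6], [4, 12], [-2, 6]]))
```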
3. The transpose operation $A^T$ (or $A'$) of a matrix changes the columns into rows, so that the first column of $A$ becomes the first row of $A^T$, the second column becomes the second row, and so on. So the $(i,j)$th element in $A_{m \times n}$ becomes the $(j,i)$th element in the transpose $A^T_{n \times m}$.

   - $(A^T)^T = A$
   - $(A + B)^T = A^T + B^T$
   - $(AB)^T = B^T A^T$
   - $(cA)^T = c A^T$

Example:
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}_{2 \times 3}, \quad A^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}_{3 \times 2}$$

4. We can define matrix multiplication $AB$ if the number of elements in a row of $A$ is the same as the number of elements in a column of $B$, e.g., when $A$ is $(p \times k)$ and $B$ is $(k \times n)$. An element of the new matrix $AB$ is formed by taking the inner product of each row of $A$ with each column of $B$. The matrix product is $A_{m \times n} B_{n \times p} = C_{m \times p}$ with
$$c_{ij} = \sum_{k=1}^{n} a_{ik} b_{kj}$$
$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} = \begin{pmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{pmatrix}$$

   - $(AB)C = A(BC)$
   - $A(B + C) = AB + AC$
   - $(A + B)C = AC + BC$

Example:
$$A = \begin{pmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{pmatrix}, \quad A^T = \begin{pmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{pmatrix}$$
$$AA^T = \begin{pmatrix} 1(1) + 2(2) + 3(3) & 1(4) + 2(5) + 3(6) \\ 4(1) + 5(2) + 6(3) & 4(4) + 5(5) + 6(6) \end{pmatrix} = \begin{pmatrix} 14 & 32 \\ 32 & 77 \end{pmatrix}$$

$AB$ does not generally equal $BA$!

Special Square Matrices

1. A square matrix $A$ is said to be symmetric if $A^T = A$, or $a_{ij} = a_{ji}$ for all $i$ and $j$.
$$\begin{pmatrix} 3 & 5 \\ 5 & 2 \end{pmatrix} \text{ is symmetric;} \quad \begin{pmatrix} 3 & 5 \\ 6 & 2 \end{pmatrix} \text{ is not symmetric}$$
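The transpose and multiplication rules above, including the non-commutativity warning, can be sketched numerically with NumPy (an illustration, not part of the original handout; the $2 \times 3$ matrix is the example from the text and the other matrices are arbitrary):

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])  # 2x3 example matrix from the text

# Transpose turns columns into rows
assert np.array_equal(A.T, np.array([[1, 4], [2, 5], [3, 6]]))

# A (2x3) times A^T (3x2): each entry is a row/column inner product
P = A @ A.T
assert np.array_equal(P, np.array([[14, 32], [32, 77]]))

# (AB)^T = B^T A^T for any conformable B (arbitrary 3x2 matrix here)
B = np.array([[1, 0], [2, 1], [0, 3]])
assert np.array_equal((A @ B).T, B.T @ A.T)

# AB does not generally equal BA (arbitrary 2x2 matrices)
X = np.array([[1, 2], [3, 4]])
Y = np.array([[0, 1], [1, 0]])
assert not np.array_equal(X @ Y, Y @ X)
```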
2. Diagonal Matrix:
$$A_{n \times n} = \mathrm{diag}(a_1, \ldots, a_n) = \begin{pmatrix} a_1 & 0 & \cdots & 0 \\ 0 & a_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & a_n \end{pmatrix}$$

3. Identity Matrix:
$$I_{n \times n} = \mathrm{diag}(1, \ldots, 1) = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}$$

The identity matrix is a square matrix with ones on the diagonal and zeros elsewhere. It follows from the definition of matrix multiplication that the $(i, j)$ entry of $AI$ is
$$a_{i1} \cdot 0 + \cdots + a_{i,j-1} \cdot 0 + a_{ij} \cdot 1 + a_{i,j+1} \cdot 0 + \cdots + a_{ik} \cdot 0 = a_{ij}.$$
So $AI = A$. Similarly, $IA = A$. Therefore the matrix $I$ acts like 1 in ordinary multiplication. The fundamental scalar relation about the existence of an inverse number $a^{-1}$ such that $a^{-1} a = a a^{-1} = 1$ if $a \neq 0$ has the following matrix algebra extension: if
$$B_{k \times k} A_{k \times k} = A_{k \times k} B_{k \times k} = I_{k \times k},$$
then $B$ is called the inverse of $A$ and is denoted by $A^{-1}$.

Other Matrix Properties

1. Trace: the sum of the diagonal elements, $\mathrm{tr}(A_{n \times n}) = \sum_{i=1}^{n} a_{ii}$.

2. A square matrix that does not have a matrix inverse is called a singular matrix. The inverse of a $2 \times 2$ matrix is given by
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad A^{-1} = \frac{1}{ad - bc} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix}$$

3. A matrix is singular if and only if its determinant is 0. The determinant of a matrix $A$ is denoted by $|A|$. The determinant of a $2 \times 2$ matrix is given by
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \quad \det(A) = ad - bc$$
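The trace, the $2 \times 2$ inverse formula, and the determinant condition for singularity can be verified against NumPy's linear algebra routines. A sketch (illustrative; the matrices are arbitrary examples, not from the handout):

```python
import numpy as np

A = np.array([[3.0, 5.0],
              [5.0, 2.0]])

# Trace: sum of the diagonal elements
assert np.trace(A) == 5.0

# 2x2 determinant: ad - bc
a, b, c, d = A.ravel()
det = a * d - b * c                        # 3*2 - 5*5 = -19
assert np.isclose(det, np.linalg.det(A))

# 2x2 inverse: (1 / (ad - bc)) * [[d, -b], [-c, a]]
A_inv = (1.0 / det) * np.array([[d, -b], [-c, a]])
assert np.allclose(A_inv, np.linalg.inv(A))
assert np.allclose(A @ A_inv, np.eye(2))   # A A^{-1} = I

# A singular matrix (determinant 0) has no inverse
S = np.array([[1.0, 2.0],
              [2.0, 4.0]])                 # second row = 2 * first row
assert np.isclose(np.linalg.det(S), 0.0)
```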
Example 1: Simultaneous Equations
$$\begin{aligned} 5x + 3y + z &= 1 \\ 2x + 3y + 5z &= 2 \\ x + 9y + 6z &= 3 \end{aligned}$$
We can rewrite the above three equations as a single matrix equation:
$$\begin{pmatrix} 5 & 3 & 1 \\ 2 & 3 & 5 \\ 1 & 9 & 6 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}$$

Example 2: Variance/Covariance Matrix
For a vector of random variables $(Y_1, Y_2, \ldots, Y_n)$, we can write a matrix containing their variances and their covariances. Let $\sigma_i^2$ be the variance of $Y_i$ and let $\mathrm{cov}_{ij}$ be the covariance between $Y_i$ and $Y_j$, $i < j$. Then the variance/covariance matrix for $(Y_1, Y_2, \ldots, Y_n)$ is
$$\begin{pmatrix} \sigma_1^2 & \mathrm{cov}_{12} & \cdots & \mathrm{cov}_{1n} \\ \mathrm{cov}_{12} & \sigma_2^2 & \cdots & \mathrm{cov}_{2n} \\ \vdots & & \ddots & \vdots \\ \mathrm{cov}_{1n} & \mathrm{cov}_{2n} & \cdots & \sigma_n^2 \end{pmatrix}$$
Also note that the above can be written as
$$\begin{pmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n \end{pmatrix} \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1n} \\ \rho_{12} & 1 & \cdots & \rho_{2n} \\ \vdots & & \ddots & \vdots \\ \rho_{1n} & \rho_{2n} & \cdots & 1 \end{pmatrix} \begin{pmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_n \end{pmatrix}$$
where $\rho_{ij}$ is the correlation of $Y_i$ and $Y_j$. Note that all of these matrices are symmetric. Furthermore, the terms on the diagonal of the variance/covariance matrix must be positive, and the terms off the diagonal of the correlation matrix are bounded by $-1$ and $1$.

Example 3: Multiple Linear Regression
We have a response $Y$ and a set of $p$ independent variables $X_1, X_2, \ldots, X_p$. Assume we have $n$ observations, and for each observation $i = 1, \ldots, n$ we assume
$$Y_i = \beta_0 + \beta_1 x_{i,1} + \cdots + \beta_p x_{i,p} + \varepsilon_i,$$
where $\varepsilon_i \sim \mathrm{Normal}(0, \sigma^2)$.
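Examples 1 and 2 can both be sketched with NumPy: solving the matrix equation $Ax = b$ recovers the unknowns, and a variance/covariance matrix factors as $D R D$ with $D$ diagonal. The system below is the one from Example 1; the standard deviations and correlations are hypothetical values chosen only to illustrate the factorization:

```python
import numpy as np

# Example 1: coefficient matrix and right-hand side of the three equations
A = np.array([[5.0, 3.0, 1.0],
              [2.0, 3.0, 5.0],
              [1.0, 9.0, 6.0]])
b = np.array([1.0, 2.0, 3.0])

# Solve A @ [x, y, z] = b (preferable to forming the inverse explicitly)
xyz = np.linalg.solve(A, b)
assert np.allclose(A @ xyz, b)            # solution satisfies all three equations
assert np.allclose(np.linalg.inv(A) @ b, xyz)  # same as x = A^{-1} b

# Example 2: covariance matrix as D @ R @ D (hypothetical sds and correlations)
sd = np.array([1.0, 2.0, 3.0])
R = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])
D = np.diag(sd)
V = D @ R @ D
assert np.allclose(np.diag(V), sd ** 2)   # diagonal entries are the variances
assert np.allclose(V, V.T)                # covariance matrices are symmetric
```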
We can represent the $n$ observations simultaneously in matrix form as
$$Y = X_{n \times (p+1)} \, \beta_{(p+1) \times 1} + \varepsilon$$
$$Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \quad X = \begin{pmatrix} 1 & x_{11} & x_{12} & \cdots & x_{1p} \\ 1 & x_{21} & x_{22} & \cdots & x_{2p} \\ \vdots & & & \ddots & \vdots \\ 1 & x_{n1} & x_{n2} & \cdots & x_{np} \end{pmatrix}, \quad \beta = \begin{pmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}, \quad \varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$

The residuals $Y_i - (\beta_0 + \beta_1 x_{i,1} + \cdots + \beta_p x_{i,p})$ can be written as
$$Y - X\beta = \begin{pmatrix} y_1 - (\beta_0 + \beta_1 x_{1,1} + \beta_2 x_{1,2} + \cdots + \beta_p x_{1,p}) \\ y_2 - (\beta_0 + \beta_1 x_{2,1} + \beta_2 x_{2,2} + \cdots + \beta_p x_{2,p}) \\ \vdots \\ y_n - (\beta_0 + \beta_1 x_{n,1} + \beta_2 x_{n,2} + \cdots + \beta_p x_{n,p}) \end{pmatrix}$$

The residual sum of squares (RSS) is
$$\left( Y - X\beta \right)^T \left( Y - X\beta \right) = \sum_{i=1}^{n} \left( Y_i - \beta_0 - \beta_1 x_{i,1} - \cdots - \beta_p x_{i,p} \right)^2$$

Minimizing the above RSS gives the usual least-squares estimate for $\beta$:
$$\hat{\beta}_{OLS} = \mathrm{argmin}_\beta \, \left( Y - X\beta \right)^T \left( Y - X\beta \right) = (X^T X)^{-1} X^T Y$$
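The closed-form estimate $(X^T X)^{-1} X^T Y$ can be checked against a least-squares solver on simulated data. A sketch using NumPy (the data, true coefficients, and noise scale are all hypothetical choices, not from the handout):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate data from Y = X beta + eps with hypothetical coefficients
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # column of ones for beta_0
beta_true = np.array([1.0, 2.0, -0.5])
Y = X @ beta_true + rng.normal(scale=0.1, size=n)

# Closed-form OLS estimate: (X^T X)^{-1} X^T Y
beta_hat = np.linalg.inv(X.T @ X) @ X.T @ Y

# Same estimate from a least-squares solver (more stable numerically)
beta_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)
assert np.allclose(beta_hat, beta_lstsq)

# With small noise the estimate should be close to the true coefficients
assert np.allclose(beta_hat, beta_true, atol=0.1)
```

In practice the explicit inverse is avoided in favor of a QR or least-squares solve, but the closed form above is exactly the $\hat{\beta}_{OLS}$ derived in the text.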