Linear Algebra Primer

D.S. Stutts
November 8, 1995

1 Introduction

This primer was written to provide a brief overview of the main concepts and methods in elementary linear algebra. It is not intended to take the place of any of the many elementary linear algebra texts on the market, and it contains relatively few examples and no exercises. The interested reader will find more in-depth coverage of these topics in introductory textbooks. Much of the material, including the order in which it is presented, comes from Howard Anton's Elementary Linear Algebra, 2nd Ed., John Wiley, 1977. Additional texts include Advanced Engineering Mathematics by C.R. Wylie and L.C. Barrett, McGraw-Hill, and the similar text of the same title by E. Kreyszig, John Wiley. A more advanced text is Linear Algebra and its Applications by Gilbert Strang, Academic Press.

The author hopes that this primer will answer some of your questions as they arise and provide some motivation (prime the pump, so to speak) for you to explore the subject in more depth. At the very least, you now have a list (albeit a short one) of references from which to obtain more in-depth explanations.

It should be noted that the examples given here have been motivated by the solution of consistent systems of equations which have an equal number of unknowns and equations; therefore, only the analysis of square (n by n) matrices is presented. Also, for clarity of notation, bold symbols are used to denote vectors and matrices, and non-bold symbols are used to denote scalar quantities.

2 Linear Systems of Equations

The system of equations

a_11 x_1 + a_12 x_2 + ... + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ... + a_2n x_n = b_2
  ...
a_n1 x_1 + a_n2 x_2 + ... + a_nn x_n = b_n        (1)

may be written in the matrix form

[ a_11 a_12 ... a_1n ] [ x_1 ]   [ b_1 ]
[ a_21 a_22 ... a_2n ] [ x_2 ] = [ b_2 ]
[ ...   ...     ...  ] [ ... ]   [ ... ]
[ a_n1 a_n2 ... a_nn ] [ x_n ]   [ b_n ]        (2)

or

AX = b        (3)

where A is an n by n matrix and b is an n by 1 matrix, or vector.
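The matrix form AX = b maps directly onto numerical solvers; a minimal sketch using NumPy follows (the 2-by-2 system is hypothetical, chosen only so the answer is easy to check by hand):

```python
import numpy as np

# A hypothetical 2-by-2 system, used only for illustration:
#   2*x1 + 1*x2 = 5
#   1*x1 + 3*x2 = 10
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([5.0, 10.0])

x = np.linalg.solve(A, b)   # solves A X = b for the unknown vector X
print(x)                    # ~ [1. 3.]
```

Solving via `np.linalg.solve` (LU factorization) is preferred over forming the inverse explicitly, for both speed and numerical accuracy.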
3 Matrix Properties

For any matrices A, B, and C, the following hold, provided that the indicated sums and products are defined:

1. A + B = B + A
2. A + (B + C) = (A + B) + C
3. A(B + C) = AB + AC
4. AI = IA = A
5. OA = AO = O
6. A + O = A

Definition: AB is defined if B has the same number of rows as A has columns.

3.1 Indicial Notation

A matrix A may be described by indicial notation. The term located at the i-th row and j-th column is denoted by the scalar a_ij. Thus,

A + B = [a_ij + b_ij]

Example 1

A_(2x2) + B_(2x2) = [ (a_11 + b_11)  (a_12 + b_12) ]
                    [ (a_21 + b_21)  (a_22 + b_22) ]

4 Multiplication of a Matrix

Multiplication by a scalar:

alpha A = [alpha a_ij] = alpha [ a_11 a_12 ]  =  [ alpha a_11  alpha a_12 ]
                               [ a_21 a_22 ]     [ alpha a_21  alpha a_22 ]

Multiplication by a matrix: C = AB, where

c_ij = sum over k = 1 to n of a_ik b_kj

Example 2

[ a_11 a_12 ] [ b_11 b_12 ]   [ a_11 b_11 + a_12 b_21   a_11 b_12 + a_12 b_22 ]
[ a_21 a_22 ] [ b_21 b_22 ] = [ a_21 b_11 + a_22 b_21   a_21 b_12 + a_22 b_22 ]
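The indicial formula c_ij = sum_k a_ik b_kj can be coded directly as a triple loop; a sketch in plain Python (the 2-by-2 test inputs are arbitrary values chosen for illustration):

```python
def matmul(A, B):
    """Multiply two matrices straight from c_ij = sum_k a_ik * b_kj."""
    n, m = len(A), len(B[0])
    p = len(B)                       # inner dimension: columns of A = rows of B
    C = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for k in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))   # [[19.0, 22.0], [43.0, 50.0]]
```

Note that `matmul(A, B)` and `matmul(B, A)` generally differ, which is the point made next.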
Note that in general, AB is not equal to BA.

Multiplication by a vector:

AX = [ a_11 a_12 ] [ x_1 ]   [ a_11 x_1 + a_12 x_2 ]
     [ a_21 a_22 ] [ x_2 ] = [ a_21 x_1 + a_22 x_2 ]

4.1 Additional Properties

A^-1 A = A A^-1 = I, if A^-1 exists, where I is the identity matrix, with ones on the main diagonal and zeros elsewhere.

4.2 The Determinant Operation

Recall from dynamics that in three-dimensional space the vector product of any two vectors A = a_1 i + a_2 j + a_3 k and B = b_1 i + b_2 j + b_3 k is defined as

A x B = det [ i    j    k  ]
            [ a_1  a_2  a_3 ] = (a_2 b_3 - a_3 b_2) i - (a_1 b_3 - a_3 b_1) j + (a_1 b_2 - a_2 b_1) k        (4)
            [ b_1  b_2  b_3 ]

This method of computing the determinant is called cofactor expansion. A cofactor is the signed minor of a given element in a matrix. A minor M_ij is the determinant of the submatrix which remains after the i-th row and the j-th column of the matrix are deleted. Here,

M_11 = a_2 b_3 - a_3 b_2
M_12 = a_1 b_3 - a_3 b_1        (5)
M_13 = a_1 b_2 - a_2 b_1

The cofactors are given by

C_ij = (-1)^(i+j) M_ij        (6)

In the above example,

C_11 = (-1)^(1+1) M_11 = M_11
C_12 = (-1)^(1+2) M_12 = -M_12        (7)

etc. Hence,

det [ i    j    k  ]
    [ a_1  a_2  a_3 ] = i C_11 + j C_12 + k C_13        (8)
    [ b_1  b_2  b_3 ]

and, in general, expansion along the first row of a 3 by 3 matrix A gives

det A = a_11 C_11 + a_12 C_12 + a_13 C_13        (9)

Example 3

Let A = [ 2  1 ]
        [ 3 -2 ]

det A = 2(-2) - (1)(3) = -7
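Cofactor expansion along the first row extends to any square matrix by recursion on the minors; a sketch in plain Python (the function name `det` and the test matrix are ours):

```python
def det(A):
    """Determinant by cofactor expansion along the first row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        # minor M_1j: delete row 1 and column j (0-indexed here)
        minor = [row[:j] + row[j + 1:] for row in A[1:]]
        cofactor = (-1) ** j * det(minor)    # C_1j = (-1)^(1+j) M_1j
        total += A[0][j] * cofactor
    return total

print(det([[2, 1], [3, -2]]))   # -7
```

The recursion mirrors equation (9): each element of the first row multiplies its signed minor.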
Example 4

Let A = [ 2 -1  0 ]
        [ 3  2  7 ]
        [ 1  1  2 ]

Then, expanding along the first row,

det A = 2 det[ 2 7 ] - (-1) det[ 3 7 ] + 0 det[ 3 2 ]
             [ 1 2 ]           [ 1 2 ]        [ 1 1 ]

      = 2(4 - 7) + (6 - 7) + 0 = -6 - 1 = -7

Expansion by cofactors is mostly useful for small matrices (smaller than 4 by 4); for larger matrices, the number of operations becomes prohibitively large. A 2 by 2 matrix requires one determinant operation, and a 3 by 3 matrix requires three 2 by 2 determinant operations. A 4 by 4 matrix, however, requires the computation of 4! = 24 signed elementary products, and a 10 by 10 matrix would require 10! = 3,628,800 signed elementary products! This trend suggests that soon even the largest and fastest computers would choke on such a computation. For large matrices, the determinant is best computed using row reduction.

Row reduction consists of using elementary row and column operations to reduce a matrix to a simpler form, usually upper or lower triangular form. This is accomplished by multiplying one row by a constant and adding it to another row to produce a zero at the desired position.

Example 5

Let A = [ 2 -1  0 ]
        [ 3  2  7 ]
        [ 1  1  2 ]

Reduce A to upper triangular form, i.e., all zeros under the main diagonal. Multiplying row 1 by -1/2 and adding it to row 3 yields

[ 2  -1   0 ]
[ 3   2   7 ]
[ 0  3/2  2 ]

Similarly, multiplying row 1 by -3/2 and adding it to row 2 yields

[ 2  -1   0 ]
[ 0  7/2  7 ]
[ 0  3/2  2 ]

Multiplying row 2 by -(3/2)/(7/2) = -3/7 and adding it to row 3 yields

[ 2  -1   0 ]
[ 0  7/2  7 ]
[ 0   0  -1 ]

The determinant is now easily computed by multiplying the elements of the main diagonal:

det A = 2(7/2)(-1) = -7

which agrees with the cofactor expansion of Example 4.
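The row-reduction procedure can be sketched in plain Python as follows (partial pivoting is our addition, included for numerical safety; a row swap flips the sign of the determinant):

```python
def det_by_elimination(A):
    """Reduce A to upper triangular form by adding multiples of one row
    to another (which leaves the determinant unchanged), then multiply
    the diagonal entries."""
    A = [row[:] for row in A]              # work on a copy
    n = len(A)
    sign = 1.0
    for k in range(n):
        # partial pivoting: bring the largest entry in column k to row k
        p = max(range(k, n), key=lambda r: abs(A[r][k]))
        if A[p][k] == 0:
            return 0.0                     # singular matrix
        if p != k:
            A[k], A[p] = A[p], A[k]
            sign = -sign                   # each swap flips the sign
        for i in range(k + 1, n):
            m = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= m * A[k][j]
    d = sign
    for k in range(n):
        d *= A[k][k]
    return d

print(det_by_elimination([[2, -1, 0], [3, 2, 7], [1, 1, 2]]))   # about -7
```

Elimination costs on the order of n^3 operations rather than n!, which is why it is the method of choice for large matrices.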
This type of row reduction is called Gaussian elimination, and it is much more efficient than the cofactor expansion technique for large matrices.

4.3 The Transpose Operation and its Properties

If A may be written as [a_ij], then its transpose is defined as A^t = [a_ji].

Properties:

(i) (A^t)^t = A
(ii) (A + B)^t = A^t + B^t
(iii) (kA)^t = k A^t
(iv) (AB)^t = B^t A^t

A matrix A is termed symmetric if A^t = A, and skew-symmetric if A^t = -A.

4.4 Properties of the Determinant Operation

1. If A is a square matrix, then det(A^t) = det(A).

2. det(kA) = k^n det(A), where A is an n by n matrix and k is a scalar.

3. det(AB) = det(A) det(B), where A and B are square matrices of the same size.

4. A square matrix A is invertible if and only if det(A) is nonzero.
Proof: If A is invertible, then A A^-1 = I, so det(A) det(A^-1) = det(I) = 1, and hence det(A) cannot be zero.

5. If A is invertible, then det(A^-1) = 1/det(A).
The proof follows from (4): det(A) det(A^-1) = det(I) = 1.

An important result from (4) is the following. For a given system of equations AX = O:

6. There exists a nontrivial X (X not equal to O) if and only if det(A) = 0.
Proof: Assume det(A) is nonzero. This implies that A^-1 exists, from (4). Then A^-1 AX = IX = X = A^-1 O = O, so X = O (a contradiction). Hence the only way that X can be other than zero is for det(A) = 0, and therefore for A^-1 not to exist.

The result given by (6) is used very often in applied mathematics, physics, and engineering.

7. If A is invertible, then

A^-1 = (1/det(A)) adj(A)

where the adjoint adj(A) is the transpose of the matrix of cofactors, adj(A) = [C_ij]^t.

Another way to calculate the inverse of a matrix is by Gaussian elimination; this method is easier to apply to larger matrices. Since A^-1 A = I, we start with the matrix A which we want to invert on the left and the identity matrix on the right, and do elementary row operations (Gaussian elimination) on A while simultaneously doing the same operations on I. This can be accomplished by adjoining the two matrices to form the matrix [A | I]; when the left half has been reduced to I, the right half is A^-1.
Example 6

Let A = [ 1 2 3 ]
        [ 2 5 3 ]
        [ 1 0 8 ]

Adjoining A with I yields

[ 1 2 3 | 1 0 0 ]
[ 2 5 3 | 0 1 0 ]
[ 1 0 8 | 0 0 1 ]

Adding -2 times the first row to the second row, and -1 times the first row to the third, yields

[ 1  2  3 |  1 0 0 ]
[ 0  1 -3 | -2 1 0 ]
[ 0 -2  5 | -1 0 1 ]

Adding 2 times the 2nd row to the third yields

[ 1 2  3 |  1 0 0 ]
[ 0 1 -3 | -2 1 0 ]
[ 0 0 -1 | -5 2 1 ]

Multiplying the third row by -1 yields

[ 1 2  3 |  1  0  0 ]
[ 0 1 -3 | -2  1  0 ]
[ 0 0  1 |  5 -2 -1 ]

Adding 3 times the third row to the second, and -3 times the third row to the first, yields

[ 1 2 0 | -14  6  3 ]
[ 0 1 0 |  13 -5 -3 ]
[ 0 0 1 |   5 -2 -1 ]

Finally, adding -2 times the second row to the first yields

[ 1 0 0 | -40 16  9 ]
[ 0 1 0 |  13 -5 -3 ]
[ 0 0 1 |   5 -2 -1 ]

Thus,

A^-1 = [ -40 16  9 ]
       [  13 -5 -3 ]
       [   5 -2 -1 ]

8. Cramer's rule

If AX = B is a system of n linear equations in n unknowns such that det(A) is nonzero, then the system has a unique solution, given by

x_1 = det(A_1)/det(A),   x_2 = det(A_2)/det(A),   ...,   x_n = det(A_n)/det(A)
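The adjoin-and-reduce procedure is easy to automate; a sketch using NumPy (the routine and the 3-by-3 test matrix are ours, added for illustration):

```python
import numpy as np

def invert(A):
    """Invert A by Gauss-Jordan elimination on the adjoined matrix [A | I]."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])   # form [A | I]
    for k in range(n):
        p = k + np.argmax(np.abs(M[k:, k]))       # pivot row (partial pivoting)
        M[[k, p]] = M[[p, k]]                     # swap it into place
        M[k] /= M[k, k]                           # scale the pivot row to 1
        for i in range(n):
            if i != k:
                M[i] -= M[i, k] * M[k]            # clear the rest of column k
    return M[:, n:]                               # right half is now A^-1

A = np.array([[1.0, 2, 3], [2, 5, 3], [1, 0, 8]])
print(invert(A))   # ~ [[-40, 16, 9], [13, -5, -3], [5, -2, -1]]
```

In practice one rarely forms an explicit inverse; `np.linalg.solve` applies the same elimination directly to a right-hand side.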
where A_j is the matrix obtained by replacing the entries of the j-th column of A by the entries of the vector

B = [ b_1 ]
    [ b_2 ]
    [ ... ]
    [ b_n ]

Example 7

Consider the following system; solve for x_1, x_2, and x_3:

2x_1 + 4x_2 - 2x_3 = 8
       2x_2 + 3x_3 = 2
x_1         + 5x_3 = 7

Here

A = [ 2 4 -2 ]        B = [ 8 ]
    [ 0 2  3 ]            [ 2 ]
    [ 1 0  5 ]            [ 7 ]

and

A_1 = [ 8 4 -2 ]    A_2 = [ 2 8 -2 ]    A_3 = [ 2 4 8 ]
      [ 2 2  3 ]          [ 0 2  3 ]          [ 0 2 2 ]
      [ 7 0  5 ]          [ 1 7  5 ]          [ 1 0 7 ]

Expanding each determinant along a row or column containing zeros:

det(A)   = 2(2(5) - 3(0)) + 1(4(3) - (-2)(2)) = 20 + 16 = 36
det(A_1) = 7(4(3) - (-2)(2)) + 5(8(2) - 4(2)) = 7(16) + 5(8) = 152
det(A_2) = 2(2(5) - 3(7)) + 1(8(3) - (-2)(2)) = -22 + 28 = 6
det(A_3) = 2(2(7) - 2(0)) + 1(4(2) - 8(2)) = 28 - 8 = 20

Thus,

x_1 = det(A_1)/det(A) = 152/36 = 38/9
x_2 = det(A_2)/det(A) = 6/36 = 1/6
x_3 = det(A_3)/det(A) = 20/36 = 5/9

Note that we took advantage of any zeros by expanding the cofactors along the rows or columns that contained them.

Cramer's rule is particularly efficient on 2 by 2 systems. Consider the general system

[ a_11 a_12 ] [ x_1 ]   [ b_1 ]
[ a_21 a_22 ] [ x_2 ] = [ b_2 ]
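Cramer's rule translates directly into a short routine; a sketch using NumPy's `det` (the routine and the small nonsingular test system are ours, chosen for illustration):

```python
import numpy as np

def cramer(A, b):
    """Solve A x = b by Cramer's rule: x_j = det(A_j) / det(A),
    where A_j is A with its j-th column replaced by b."""
    d = np.linalg.det(A)
    if abs(d) < 1e-12:
        raise ValueError("det(A) = 0: Cramer's rule does not apply")
    x = np.empty(len(b))
    for j in range(len(b)):
        Aj = A.copy()
        Aj[:, j] = b                  # replace column j with b
        x[j] = np.linalg.det(Aj) / d
    return x

A = np.array([[2.0, 4.0, -2.0],
              [0.0, 2.0,  3.0],
              [1.0, 0.0,  5.0]])
b = np.array([8.0, 2.0, 7.0])
print(cramer(A, b))                   # agrees with np.linalg.solve(A, b)
```

Cramer's rule is elegant but costs one determinant per unknown, so elimination-based solvers win for anything beyond small systems.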
det(a a a 22 a 2 a 2 det(a b a 22 a 2 b 2 det(a 2 a b 2 b a 2 T hus, x det(a deta a 22b a 2 b 2 a a 22 a 2 a 2 x 2 det(a 2 det(a a b 2 b a 2 a a 22 a 2 a 2 One can easily show that Cramer s rule yields the same result as multiplying the original system of equations by A. For example, Given : AX b we have, A AX A b IX X A b 5 The Characteristic Polynomial and Eigenvalues and Eigenvectors We have shown that the system AX has a non-trivial solution if and only if det(a. A related type of problem is the following: Consider the system AX λx where λ is a scalar value. Equivalently, we may write (A λix O Hence, from theorem(6, there exists a nontrivial X if and only if det(a λi Evaluation of the above results in a polynomial in λ. This is the so called characteristic polynomial and its roots λ i are the characteristic values or eigenvalues. Evaluation of (9 yields, Furthermore, the solution of X i of is the eigenvector of the matrix A λ n + c λ n + c 2 λ n 2 + c n (A λ i IX i The result given by therem(9 is one of the most important in physics and engineering. It can be shown that the matrix A itself satisfies the characteristic polynomial. A n + c A n + c 2 A n 2 + + c n I This result is known as Cayley-Hamilton theorem. 8
Example 8

Find the eigenvalues and eigenvectors of

A = [  4  5 ]
    [ -1 -2 ]

Solution:

det(A - lambda I) = det [ 4-lambda     5      ]
                        [   -1     -2-lambda ] = (4 - lambda)(-2 - lambda) + 5

                  = lambda^2 - 2 lambda - 3 = (lambda - 3)(lambda + 1) = 0

Thus lambda_1 = -1, lambda_2 = 3. The eigenvector of A corresponding to lambda_1 = -1 may be found as follows:

(A - lambda_1 I)X_1 = [  5  5 ] [ x_1 ]   [ 0 ]
                      [ -1 -1 ] [ x_2 ] = [ 0 ]

which has the obvious solution

X_1 = c_1 [  1 ]
          [ -1 ]

where c_1 is an arbitrary scalar. Similarly, substituting lambda_2 = 3 yields

[  1  5 ] [ x_1 ]   [ 0 ]
[ -1 -5 ] [ x_2 ] = [ 0 ]

Thus,

X_2 = c_2 [  5 ]
          [ -1 ]

The calculation of eigenvalues and eigenvectors has many applications. One important application is the similarity transformation. Under certain conditions, the discussion of which is beyond the scope of this primer, a general system of equations AX = B may be transformed into an equivalent diagonal system DY = B', where D is a diagonal matrix; this makes the solution especially easy. Two square matrices A and B are said to be similar if there is an invertible matrix P such that A = P B P^-1; similar matrices have the same determinant and the same eigenvalues. It turns out that an n by n matrix A is diagonalizable if and only if A has n linearly independent eigenvectors.
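Beyond 2 by 2, eigenvalues are rarely found by hand; a sketch using NumPy's `eig` (the 2-by-2 test matrix is ours, chosen so the exact answers are easy to verify):

```python
import numpy as np

A = np.array([[4.0, 5.0],
              [-1.0, -2.0]])

# roots of the characteristic polynomial det(A - lambda I) = 0
lams, vecs = np.linalg.eig(A)
print(np.sort(lams.real))        # ~ [-1.  3.]

# each column of `vecs` satisfies A x = lambda x
for lam, x in zip(lams, vecs.T):
    assert np.allclose(A @ x, lam * x)
```

`eig` returns eigenvectors normalized to unit length; any scalar multiple of a returned column is an equally valid eigenvector, matching the arbitrary constants c_1 and c_2 above.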
By the preceding result, the matrix of our previous example,

A = [  4  5 ]
    [ -1 -2 ]

should be diagonalizable, since X_1 and X_2 are linearly independent (i.e., X_1 is not equal to K X_2 for any scalar K). The matrix P composed of X_1 and X_2 as its two columns will diagonalize A. We prove that by trying it:

P = [  1  5 ]        P^-1 = [ -1/4 -5/4 ]
    [ -1 -1 ]               [  1/4  1/4 ]

D = P^-1 A P = [ -1/4 -5/4 ] [  4  5 ] [  1  5 ]   [ -1 0 ]
               [  1/4  1/4 ] [ -1 -2 ] [ -1 -1 ] = [  0 3 ]

We see that D is composed of lambda_1 and lambda_2 on its main diagonal and zeros elsewhere.

5.1 Symmetric Matrices

5.1.1 Special properties

Let A (n by n) satisfy A = A^t. Then:

(i) The characteristic polynomial has n real roots.

(ii) Eigenvectors corresponding to distinct real roots are orthogonal. If

A x^(1) = lambda_1 x^(1)
A x^(2) = lambda_2 x^(2)

then

lambda_1 x^(2)t x^(1) = x^(2)t A x^(1) = (A^t x^(2))^t x^(1) = (A x^(2))^t x^(1) = lambda_2 x^(2)t x^(1)

so that (lambda_1 - lambda_2) x^(2)t x^(1) = 0, and since lambda_1 is not equal to lambda_2, x^(2)t x^(1) = 0.

(iii) If lambda_r is a root of the characteristic polynomial of algebraic multiplicity m, there exist m independent eigenvectors corresponding to lambda_r.

Example 9 (symmetric)

A = [ 4 1 0 ]
    [ 1 4 0 ]
    [ 0 0 5 ]

Characteristic polynomial: ((lambda - 4)^2 - 1)(lambda - 5) = (lambda - 3)(lambda - 5)^2 = 0, so lambda_1 = 3 and lambda_2 = lambda_3 = 5.

lambda_1 = 3:   x^(1) = [  1 ]
                        [ -1 ]
                        [  0 ]

lambda_2 = lambda_3 = 5:   x^(2) = [ 1 ]    x^(3) = [ 0 ]
                                   [ 1 ]            [ 0 ]
                                   [ 0 ]            [ 1 ]
Example 10 (non-symmetric)

A = [ 5 0 0 ]
    [ 4 5 0 ]
    [ 0 0 3 ]

Characteristic polynomial: (lambda - 5)^2 (lambda - 3) = 0, so lambda_1 = 3 and lambda_2 = lambda_3 = 5.

lambda_1 = 3:   x^(1) = [ 0 ]
                        [ 0 ]
                        [ 1 ]

lambda_2 = lambda_3 = 5:   x^(2) = [ 0 ]
                                   [ 1 ]
                                   [ 0 ]

Here lambda = 5 has algebraic multiplicity 2, but there is only one independent eigenvector corresponding to it, so this A is not diagonalizable.
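The orthogonality property (ii) is easy to verify numerically; a sketch using NumPy's `eigh`, which is designed for symmetric matrices and returns an orthogonal eigenvector matrix (the 3-by-3 test matrix is ours):

```python
import numpy as np

# a symmetric test matrix with a repeated eigenvalue
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 0.0],
              [0.0, 0.0, 5.0]])

lams, V = np.linalg.eigh(A)    # eigh exploits symmetry; eigenvalues ascend
print(np.sort(lams))           # ~ [3. 5. 5.]

# the columns of V are mutually orthogonal unit eigenvectors, so V^t V = I
print(np.allclose(V.T @ V, np.eye(3)))   # prints True
```

For a non-symmetric matrix such as the one in Example 10, `np.linalg.eig` would return a rank-deficient eigenvector matrix, signaling that the matrix is defective.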