BASIC ALGORITHMS IN LINEAR ALGEBRA

STEVEN DALE CUTKOSKY

Matrices and Applications of Gaussian Elimination

1. Systems of Equations. Suppose that $A$ is an $m \times n$ matrix with coefficients in a field $F$, and $x = (x_1, \dots, x_n)^T \in F^n$. Let $v \cdot w = v^T w$ be the dot product of the vectors $v, w \in F^n$. Writing $A = (A_1, A_2, \dots, A_n)$, where $A_i \in F^m$ are the columns of $A$, we obtain the formula
\[ Ax = x_1 A_1 + \cdots + x_n A_n. \]
Writing
\[ A = \begin{pmatrix} A^1 \\ A^2 \\ \vdots \\ A^m \end{pmatrix}, \]
where $A^j \in F^n$ are the rows of $A$, we obtain the formula
\[ Ax = \begin{pmatrix} A^1 x \\ A^2 x \\ \vdots \\ A^m x \end{pmatrix} = \begin{pmatrix} (A^1)^T \cdot x \\ (A^2)^T \cdot x \\ \vdots \\ (A^m)^T \cdot x \end{pmatrix}. \]

2. Computation of the inverse of a matrix. Suppose that $A$ is an $n \times n$ matrix. Transform the $n \times 2n$ matrix $(A \mid I_n)$ into a reduced row echelon form $(C \mid B)$. $A$ is invertible if and only if $C = I_n$. If $A$ is invertible, then $B = A^{-1}$.

3. Computation of a basis of the span of a set of row vectors. Suppose that $v_1, \dots, v_m \in F^n$. Transform the $m \times n$ matrix
\[ \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_m \end{pmatrix} \]
into a reduced row echelon form $B$. The nonzero rows of $B$ form a basis of $\mathrm{Span}(\{v_1, \dots, v_m\})$.

4. Computation of a subset of a set of column vectors which is a basis of the span of the set. Suppose that $w_1, \dots, w_n \in F^m$. Transform the $m \times n$ matrix $(w_1, w_2, \dots, w_n)$ into a reduced row echelon form $B$. Let $\sigma(1) < \sigma(2) < \cdots < \sigma(r)$ be the indices of the columns $B_i$ of $B$ which contain a leading $1$. Then $\{w_{\sigma(1)}, \dots, w_{\sigma(r)}\}$ is a basis of $\mathrm{Span}(\{w_1, w_2, \dots, w_n\})$.
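The row-reduction recipes of 2 and 4 can be sketched with sympy's exact `rref`; the matrices below are made-up examples, not from the text.

```python
# Sketch of algorithms 2 and 4 via exact row reduction in sympy.
# A and M are made-up example matrices.
import sympy as sp

A = sp.Matrix([[2, 1], [1, 1]])
n = A.rows

# Algorithm 2: row-reduce (A | I_n); if the left block becomes I_n,
# the right block is A^{-1}.
aug = A.row_join(sp.eye(n))
R, _ = aug.rref()
C, B = R[:, :n], R[:, n:]
assert C == sp.eye(n)      # A is invertible
assert B == A.inv()

# Algorithm 4: the pivot columns of the RREF pick out a subset of the
# columns of M that is a basis of the column space.
M = sp.Matrix([[1, 2, 0], [2, 4, 1]])
_, pivots = M.rref()       # pivots = indices sigma(1) < ... < sigma(r)
basis = [M[:, j] for j in pivots]
assert len(basis) == M.rank()
```

Here `rref` performs the Gaussian elimination described above over the rationals, so the results are exact.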
5. Extension of a set of linearly independent row vectors to a basis of $F^n$. Suppose that $w_1, \dots, w_m \in F^n$ are linearly independent. Let $\{e_1, \dots, e_n\}$ be the standard basis of $F^n$. Transform the $m \times n$ matrix
\[ \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_m \end{pmatrix} \]
into a reduced row echelon form $B$. Let $\sigma(1) < \sigma(2) < \cdots < \sigma(n-m)$ be the indices of the columns of $B$ which do not contain a leading $1$. Then $\{w_1, \dots, w_m, e_{\sigma(1)}, \dots, e_{\sigma(n-m)}\}$ is a basis of $F^n$. (Some different algorithms are given later in the pages on inner product spaces.)

6. Computation of a basis of the solution space of a homogeneous system of equations. Let $A = (a_{ij})$ be an $m \times n$ matrix, and let $X = (x_i)$ be an $n \times 1$ matrix of indeterminates. Let $N(A)$ be the null space of the matrix $A$ (the subspace of $F^n$ of all $X \in F^n$ such that $AX = 0_m$). A basis for $N(A)$ can be found by solving the system $AX = 0_m$ using Gaussian elimination to find the general solution, putting the general solution into a column vector, and expanding it as a linear combination with the free variables as coefficients. The vectors appearing in this expansion are a basis of $N(A)$.
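A minimal computational sketch of algorithm 6, using sympy's `nullspace` (which carries out the Gaussian elimination described above); the matrix is a made-up example.

```python
# Sketch of algorithm 6: a basis of N(A) from the general solution of
# AX = 0. The matrix A is a made-up example.
import sympy as sp

A = sp.Matrix([[1, 2, 1], [2, 4, 0]])
basis = A.nullspace()                    # sympy solves AX = 0 exactly
assert len(basis) == A.cols - A.rank()   # rank-nullity theorem
for v in basis:
    assert A * v == sp.zeros(A.rows, 1)  # each basis vector solves AX = 0
```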
Calculation of the Matrix of a Linear Map

1. Coordinate vectors. Suppose that $V$ is a vector space with a basis $\beta = \{v_1, \dots, v_n\}$. Suppose that $v \in V$. Then there is a unique expansion $v = c_1 v_1 + \cdots + c_n v_n$ with $c_i \in F$. The coordinate vector of $v$ with respect to the basis $\beta$ is $(v)_\beta = (c_1, \dots, c_n)^T \in M_{n \times 1}$.

2. The transition matrix between bases. Suppose that $V$ is a vector space, and $\beta = \{v_1, \dots, v_n\}$, $\beta' = \{w_1, \dots, w_n\}$ are bases of $V$. The transition matrix $M_\beta^{\beta'}$ from the basis $\beta$ to the basis $\beta'$ is the unique $n \times n$ matrix which has the property that
\[ M_\beta^{\beta'} (v)_\beta = (v)_{\beta'} \]
for all $v \in V$. It follows that
\[ M_\beta^{\beta'} = \big( (v_1)_{\beta'}, (v_2)_{\beta'}, \dots, (v_n)_{\beta'} \big). \]
We have that $M_{\beta'}^{\beta} = (M_\beta^{\beta'})^{-1}$, and if $\beta''$ is a third basis of $V$, then $M_{\beta'}^{\beta''} M_\beta^{\beta'} = M_\beta^{\beta''}$. The $n \times 2n$ matrix $(w_1, w_2, \dots, w_n, v_1, \dots, v_n)$ is transformed by elementary row operations into the reduced row echelon form $(I_n, M_\beta^{\beta'})$.

3. The matrix of a linear map. Suppose that $F : V \to W$ is a linear map. Let $\beta = \{v_1, \dots, v_n\}$ be a basis of $V$, and $\beta' = \{w_1, \dots, w_m\}$ be a basis of $W$. The matrix $M_\beta^{\beta'}(F)$ of the linear map $F$ with respect to the bases $\beta$ of $V$ and $\beta'$ of $W$ is the unique $m \times n$ matrix which has the property that
\[ M_\beta^{\beta'}(F)\,(v)_\beta = (F(v))_{\beta'} \]
for all $v \in V$. It follows that
\[ M_\beta^{\beta'}(F) = \big( (F(v_1))_{\beta'}, (F(v_2))_{\beta'}, \dots, (F(v_n))_{\beta'} \big). \]
If $F$ is the identity map $\mathrm{id}$ (so that $V = W$), then $M_\beta^{\beta'}(\mathrm{id})$ is the transition matrix $M_\beta^{\beta'}$ defined above. Suppose that $G : W \to X$ is a linear map, and $\beta''$ is a basis of $X$. The composition $G \circ F : V \to X$ of $F$ and $G$ can be represented by the diagram
\[ V \xrightarrow{F} W \xrightarrow{G} X. \]
We have
\[ M_\beta^{\beta''}(G \circ F) = M_{\beta'}^{\beta''}(G)\, M_\beta^{\beta'}(F). \]
A particularly important application of this formula is $M_{\beta'}^{\beta'}(F) = S^{-1} M_\beta^{\beta}(F)\, S$, where $F : V \to V$ is linear, $\beta$ and $\beta'$ are bases of $V$, and $S = M_{\beta'}^{\beta}$. A convenient method for computing $M_\beta^{\beta'}(F)$ is the following. Let $\beta''$ be a basis of $W$ which is easy to compute with (such as a standard basis of $W$). The $m \times (m + n)$ matrix
\[ \big( (w_1)_{\beta''}, (w_2)_{\beta''}, \dots, (w_m)_{\beta''}, (F(v_1))_{\beta''}, \dots, (F(v_n))_{\beta''} \big) \]
is transformed by elementary row operations into the reduced row echelon form $(I_m, M_\beta^{\beta'}(F))$.
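The row-reduction recipe of 2 can be sketched in sympy for $V = \mathbb{Q}^2$; the two bases below are made-up examples.

```python
# Sketch of 2: row-reducing (w1 w2 | v1 v2) to (I_2 | M) yields the
# transition matrix M from beta to beta'. The bases are made-up examples.
import sympy as sp

v1, v2 = sp.Matrix([1, 0]), sp.Matrix([1, 1])     # basis beta
w1, w2 = sp.Matrix([2, 1]), sp.Matrix([0, 1])     # basis beta'

R, _ = sp.Matrix.hstack(w1, w2, v1, v2).rref()
M = R[:, 2:]                                      # transition matrix

# Defining property: M * (v)_beta = (v)_beta' for v = 3*v1 + 5*v2.
coords_betap = M * sp.Matrix([3, 5])
assert sp.Matrix.hstack(w1, w2) * coords_betap == 3 * v1 + 5 * v2
```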
Inner Product Spaces

1. The Orthogonal Space. Suppose that $A$ is an $m \times n$ matrix with coefficients in a field $F$. $R(A)$ is the column space of $A$, and $N(A)$ is the solution space to $Ax = 0$. $R(A)$ is a subspace of $F^m$, which has a nondegenerate inner product given by the dot product $v \cdot w = v^T w$ for $v, w \in F^m$. For $x \in F^n$, we have the formula
\[ Ax = \begin{pmatrix} A^1 x \\ \vdots \\ A^m x \end{pmatrix} = \begin{pmatrix} (A^1)^T \cdot x \\ \vdots \\ (A^m)^T \cdot x \end{pmatrix}, \]
and thus we have the formulas
\[ N(A) = [R(A^T)]^{\perp} \quad \text{and} \quad N(A^T) = R(A)^{\perp}. \]

2. Pythagoras's Theorem. Suppose that $V$ is a finite dimensional real vector space with positive definite inner product $\langle\, ,\, \rangle$. Suppose that $v, w \in V$ and $v \perp w$. Then $\|v + w\|^2 = \|v\|^2 + \|w\|^2$.

3. The Gram-Schmidt Process. Suppose that $V$ is a finite dimensional real vector space with positive definite inner product $\langle\, ,\, \rangle$, and that $\{x_1, \dots, x_n\}$ is a basis of $V$. Let
\[
\begin{aligned}
v_1 &= x_1, & u_1 &= \frac{v_1}{\|v_1\|}, \\
v_2 &= x_2 - \langle x_2, u_1 \rangle u_1, & u_2 &= \frac{v_2}{\|v_2\|}, \\
v_3 &= x_3 - \langle x_3, u_1 \rangle u_1 - \langle x_3, u_2 \rangle u_2, & u_3 &= \frac{v_3}{\|v_3\|}, \\
&\ \ \vdots
\end{aligned}
\]
Then $\{u_1, u_2, \dots, u_n\}$ is an orthonormal (ON) basis of $V$.

4. Coordinate Vector With Respect to an ON Basis. Suppose that $V$ is a finite dimensional real vector space with positive definite inner product $\langle\, ,\, \rangle$. Suppose that $\beta = \{u_1, \dots, u_n\}$ is an ON basis of $V$ and $v \in V$. Then
\[ (v)_\beta = \big( \langle v, u_1 \rangle, \langle v, u_2 \rangle, \dots, \langle v, u_n \rangle \big)^T. \]

5. Projection Onto a Subspace. Suppose that $V$ is a finite dimensional real vector space with positive definite inner product $\langle\, ,\, \rangle$, and that $W$ is a subspace of $V$. Then $V = W \oplus W^{\perp}$; that is, every element $v \in V$ has a unique decomposition $v = w + w'$ with $w \in W$ and $w' \in W^{\perp}$. This allows us to define the projection $\pi_W : V \to W$ by $\pi_W(v) = w$. $\pi_W$ is a linear map onto $W$. $\pi_W(v)$ is the element of $W$ which is the closest to $v$; if $x \in W$ and $x \neq \pi_W(v)$, then $\|v - \pi_W(v)\| < \|v - x\|$.
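The Gram-Schmidt process of 3 can be sketched numerically with the dot product on $\mathbb{R}^3$; the starting basis is a made-up example.

```python
# A numerical sketch of the Gram-Schmidt process of 3 on R^3 with the
# dot product. The starting basis (columns of X) is a made-up example.
import numpy as np

def gram_schmidt(X):
    """Columns of X are a basis; returns a matrix whose columns are ON."""
    U = []
    for x in X.T:                          # iterate over columns of X
        v = x.astype(float)
        for u in U:
            v = v - np.dot(x, u) * u       # subtract <x, u> u, as in 3
        U.append(v / np.linalg.norm(v))    # normalize v to get u
    return np.column_stack(U)

X = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
U = gram_schmidt(X)
assert np.allclose(U.T @ U, np.eye(3))     # columns are orthonormal
```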
$\pi_W(v)$ can be computed as follows. Let $\{u_1, \dots, u_s\}$ be an orthonormal basis of $W$. For $v \in V$, let $c_i = \langle v, u_i \rangle$ be the component of $v$ along $u_i$. Then
\[ \pi_W(v) = \sum_{k=1}^{s} c_k u_k. \]
Now let us restrict to the case where $V$ is $\mathbb{R}^n$ with $\langle v, w \rangle = v^T w$. Let $\{u_1, \dots, u_s\}$ be an orthonormal basis of $W$. Let $U$ be the $n \times s$ matrix $U = (u_1, \dots, u_s)$. Let $A = U U^T$, an $n \times n$ matrix. The linear map $L_A : \mathbb{R}^n \to \mathbb{R}^n$ is the projection $\pi_W : \mathbb{R}^n \to W$, followed by the inclusion of $W$ into $\mathbb{R}^n$.

6. Orthogonal Matrices. An $n \times n$ real matrix $A$ is orthogonal if $A^T A = I_n$. In the following theorem, we view $\mathbb{R}^n$ as an inner product space with the dot product.

Theorem 0.1. The following are equivalent for an $n \times n$ real matrix $A$:
1) $A$ is an orthogonal matrix.
2) $A^{-1} = A^T$.
3) The linear map $L_A : \mathbb{R}^n \to \mathbb{R}^n$ preserves length ($\|Ax\| = \|x\|$ for all $x \in \mathbb{R}^n$).
4) The columns of $A$ form an ON basis of $\mathbb{R}^n$.

7. Least Squares Solutions. Let $A$ be an $m \times n$ real matrix and $b \in \mathbb{R}^m$. A least squares solution of the system $Ax = b$ is a vector $x = \hat{x} \in \mathbb{R}^n$ which minimizes $\|b - Ax\|$ for $x \in \mathbb{R}^n$. The least squares solutions of the system $Ax = b$ are the solutions to the (consistent) system $A^T A x = A^T b$.

8. Fourier Series. Let $C[-\pi, \pi]$ be the continuous (real valued) functions on $[-\pi, \pi]$. $C[-\pi, \pi]$ is a real vector space, with the positive definite inner product
\[ \langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) g(t)\, dt. \]
The norm of $f \in C[-\pi, \pi]$ is defined by $\|f\|^2 = \langle f, f \rangle$. For a positive integer $n$, let $T_n$ be the subspace of $C[-\pi, \pi]$ which has the orthonormal basis
\[ \left\{ \tfrac{1}{\sqrt{2}}, \sin(t), \cos(t), \sin(2t), \cos(2t), \dots, \sin(nt), \cos(nt) \right\}. \]
Suppose that $f \in C[-\pi, \pi]$. The projection of $f$ on $T_n$ ($T$ is for "trigonometric functions") is
\[ f_n(t) = \frac{a_0}{2} + b_1 \sin(t) + c_1 \cos(t) + \cdots + b_n \sin(nt) + c_n \cos(nt), \]
where
\[ a_0 = \langle f(t), 1 \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t)\, dt, \]
and for $1 \leq k \leq n$,
\[ b_k = \langle f(t), \sin(kt) \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \sin(kt)\, dt, \]
\[ c_k = \langle f(t), \cos(kt) \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \cos(kt)\, dt. \]
$a_0$, $b_k$, $c_k$ are called the Fourier coefficients of $f$. $f_n$ is the best approximation of $f$ in $T_n$, in the sense that $g = f_n$ minimizes $\|f - g\|$ for $g \in T_n$. The infinite series
\[ g(t) = \frac{a_0}{2} + b_1 \sin(t) + c_1 \cos(t) + \cdots + b_n \sin(nt) + c_n \cos(nt) + \cdots \]
converges to $f$ on the interval $[-\pi, \pi]$. $g(t)$ is defined everywhere on $\mathbb{R}$, and $g(t)$ is periodic of period $2\pi$; that is, $g(a + 2\pi) = g(a)$ for all $a \in \mathbb{R}$. Thus in general, $g(t)$ will only be equal to $f(t)$ on the interval $[-\pi, \pi]$. There is an infinite Parseval's formula,
\[ \|f\|^2 = \frac{a_0^2}{2} + b_1^2 + c_1^2 + b_2^2 + c_2^2 + \cdots. \]

9. Extension of a set of LI vectors to a basis of $\mathbb{R}^n$. Let $v_1, \dots, v_s$ be a set of linearly independent vectors in $\mathbb{R}^n$. Let
\[ A = \begin{pmatrix} v_1^T \\ v_2^T \\ \vdots \\ v_s^T \end{pmatrix}. \]
Let $\{v_{s+1}, \dots, v_n\}$ be a basis of $N(A)$ (which can be computed using Gaussian elimination). Then $\{v_1, \dots, v_s, v_{s+1}, \dots, v_n\}$ is a basis of $\mathbb{R}^n$.

Warning: This algorithm does not work in $\mathbb{C}^n$. The reason is that $\mathbb{C}^n$ might not be equal to $W \oplus W^{\perp}$ if $W$ is a subspace of $\mathbb{C}^n$. To fix this problem, we need the notion of Hermitian inner product. An extension of this algorithm that works over any subfield $F$ of $\mathbb{C}$ will be given below in 11.
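Algorithm 9 can be sketched in sympy; the vectors below are a made-up example in $\mathbb{R}^4$.

```python
# Sketch of algorithm 9: extend linearly independent v_1, ..., v_s in R^n
# to a basis by adjoining a basis of N(A), where A has rows v_i^T.
# The vectors are a made-up example with n = 4, s = 2.
import sympy as sp

v1, v2 = sp.Matrix([1, 0, 1, 0]), sp.Matrix([0, 1, 0, 1])
A = sp.Matrix.vstack(v1.T, v2.T)
extension = A.nullspace()            # basis of N(A) = {v1, v2}^perp
full = [v1, v2] + extension
B = sp.Matrix.hstack(*full)
assert B.rank() == 4                 # the combined set is a basis of R^4
```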
10. Hermitian Inner Product Spaces. Suppose that $V$ is a complex vector space. Then the notion of positive definite inner product is generalized to that of Hermitian inner product (warning: a Hermitian inner product is not a bilinear form, so it is not a nondegenerate inner product). The dot product on $\mathbb{C}^n$ is not Hermitian. $\mathbb{C}^n$ has the Hermitian inner product $\langle v, w \rangle = v^T \overline{w}$ for $v, w \in \mathbb{C}^n$ (here $\overline{w}$ is the complex conjugate of $w$). The statements of 1 through 9 above all generalize to Hermitian inner products (with $\mathbb{R}$ replaced by $\mathbb{C}$). The statement of 1 for a complex matrix $A$ becomes
\[ Ax = \begin{pmatrix} A^1 x \\ \vdots \\ A^m x \end{pmatrix} = \begin{pmatrix} \langle x, \overline{(A^1)^T} \rangle \\ \vdots \\ \langle x, \overline{(A^m)^T} \rangle \end{pmatrix}, \]
and thus
\[ N(A) = [R(\overline{A}^T)]^{\perp} \quad \text{and} \quad N(\overline{A}^T) = R(A)^{\perp}. \]
The projection matrix $A$ of 5 becomes $A = U \overline{U}^T$. The orthogonal matrix defined in 6 generalizes to a unitary matrix. An $n \times n$ complex matrix $A$ is unitary if $\overline{A}^T A = I_n$. The criterion of 2) of the theorem of 6 then becomes $A^{-1} = \overline{A}^T$. The least squares solutions to $Ax = b$ are the solutions to $\overline{A}^T A x = \overline{A}^T b$. The inner product in 8 becomes
\[ \langle f, g \rangle = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \overline{g(t)}\, dt. \]

11. Extension of a set of LI vectors to a basis of $F^n$. Let $F$ be a subfield of $\mathbb{C}$, and let $v_1, \dots, v_s$ be a set of linearly independent vectors in $F^n$. Let
\[ A = \begin{pmatrix} \overline{v_1}^T \\ \overline{v_2}^T \\ \vdots \\ \overline{v_s}^T \end{pmatrix}. \]
Let $\{v_{s+1}, \dots, v_n\}$ be a basis of $N(A)$ (which can be computed using Gaussian elimination). Then $\{v_1, \dots, v_s, v_{s+1}, \dots, v_n\}$ is a basis of $F^n$.
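A numerical sketch of the Hermitian projection matrix $A = U \overline{U}^T$ from the remarks on 5 above; the subspace (a line in $\mathbb{C}^2$) is a made-up example.

```python
# Sketch of the Hermitian projection matrix of 10: with the inner product
# <v, w> = v^T conj(w) on C^n, projection onto a subspace with ON basis
# u_1, ..., u_s is given by U @ conj(U).T. The line in C^2 is made up.
import numpy as np

u1 = np.array([1.0, 1j]) / np.sqrt(2)   # unit vector: <u1, u1> = 1
U = u1.reshape(2, 1)
P = U @ U.conj().T                      # projection matrix U * conj(U)^T

assert np.allclose(P @ P, P)            # idempotent, as a projection must be
assert np.allclose(P.conj().T, P)       # self-adjoint
assert np.allclose(P @ u1, u1)          # fixes vectors already in the subspace
```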
Eigenvalues and Diagonalization

1. Eigenvalues and Eigenvectors. Suppose that $A \in M_{n \times n}(F)$ is an $n \times n$ matrix. $\lambda \in F$ is an eigenvalue of $A$ if there exists a nonzero vector $v \in F^n$ such that $Av = \lambda v$. Such a $v$ is called an eigenvector of $A$ with eigenvalue $\lambda$. For $\lambda \in F$,
\[ E(\lambda) = \{ v \in F^n \mid Av = \lambda v \} \]
is a subspace of $F^n$. $\lambda$ is an eigenvalue of $A$ if and only if $E(\lambda) \neq \{0\}$. The nonzero elements of $E(\lambda)$ are the eigenvectors of $A$ with eigenvalue $\lambda$. If $\lambda$ is an eigenvalue of $A$, then $E(\lambda)$ is called an eigenspace of $A$. Thus $A$ is not invertible if and only if $\lambda = 0$ is an eigenvalue of $A$. The eigenspace $E(\lambda)$ is the solution space $N(A - \lambda I_n)$.

2. The Characteristic Polynomial. The characteristic polynomial of $A \in M_{n \times n}(F)$ is $P_A(t) = \mathrm{Det}(t I_n - A)$. Observe that $P_A(t) = (-1)^n \mathrm{Det}(A - t I_n)$. The roots of $P_A(t) = 0$ are the eigenvalues of $A$.

3. Diagonalization of Matrices. Suppose that $A \in M_{n \times n}(F)$. We say that $A$ is diagonalizable (over $F$) if $A$ is similar to a diagonal matrix; that is, there exists an invertible $n \times n$ matrix $B \in M_{n \times n}(F)$ such that $B^{-1} A B = D$ is a diagonal matrix. Let $\beta = \{e_1, \dots, e_n\}$ be the standard basis of $F^n$. By 2, we have that a matrix $A \in M_{n \times n}(F)$ has only finitely many distinct eigenvalues, say $\lambda_1, \lambda_2, \dots, \lambda_r$. Suppose that $\dim E(\lambda_i) = s_i$ for $1 \leq i \leq r$. For $1 \leq i \leq r$, let $v_{i,1}, \dots, v_{i,s_i}$ be a basis of $E(\lambda_i)$. Then $v_{1,1}, \dots, v_{1,s_1}, v_{2,1}, \dots, v_{r,s_r}$ is a linearly independent set of vectors. (It can be proven that if $w_1, \dots, w_s$ are eigenvectors for $A$ with distinct eigenvalues, then $w_1 + w_2 + \cdots + w_s = 0$ implies $w_i = 0$ for all $i$.) If they form a basis $\beta'$ of $F^n$, then we have an equation
\[ M_{\beta}^{\beta'} A\, M_{\beta'}^{\beta} = D = \begin{pmatrix} \lambda_1 & & & \\ & \lambda_1 & & \\ & & \ddots & \\ & & & \lambda_r \end{pmatrix}, \]
where each $\lambda_i$ is repeated $s_i$ times on the diagonal and all nondiagonal entries of $D$ are zero. Thus we have diagonalized $A$. The matrix $M_{\beta'}^{\beta} = (v_{1,1}, \dots, v_{1,s_1}, v_{2,1}, \dots, v_{r,s_r})$ and $M_{\beta}^{\beta'} = (M_{\beta'}^{\beta})^{-1}$. Working backwards through this construction, we see that an $n \times n$ matrix $A$ is diagonalizable over $F$ if and only if $F^n$ has a basis of eigenvectors of $A$. In summary, we always have that $s_1 + \cdots + s_r \leq n$, and $A$ is diagonalizable if and only if $s_1 + \cdots + s_r = n$.

4. Eigenvalues and Diagonalization of Operators.
Everything above generalizes to an operator $T : V \to V$, where $V$ is an $n$ dimensional vector space over a field $F$. $\lambda \in F$ is an eigenvalue of $T$ if there exists a nonzero vector $v \in V$ such that $T(v) = \lambda v$. Such a $v$ is called an eigenvector of $T$ with eigenvalue $\lambda$. We can then form the eigenspace $E(\lambda)$ of an eigenvalue $\lambda$ of $T$, which is a subspace of $V$. Suppose that $\beta$ is a basis of $V$. Then we can compute the matrix $M_\beta^\beta(T)$ of $T$ with respect to the basis $\beta$. Further, we can compute
the characteristic polynomial $P_{M_\beta^\beta(T)}(t)$ of $M_\beta^\beta(T)$. This polynomial is independent of the choice of basis $\beta$ of $V$. Thus we can define the characteristic polynomial of $T$ to be $P_T(t) = P_{M_\beta^\beta(T)}(t)$, computed from any choice of basis $\beta$ of $V$. We have that the roots of $P_T(t) = 0$ are the eigenvalues of $T$. We say that $T$ is diagonalizable if there exists a basis $\beta$ of $V$ consisting of eigenvectors of $T$. In this case, the matrix $M_\beta^\beta(T)$ is a diagonal matrix.

5. Diagonalization of Real Symmetric Matrices. Suppose that $A \in M_{n \times n}(\mathbb{R})$ is a symmetric matrix. Then the spectral theorem tells us that all eigenvalues of $A$ are real and that $\mathbb{R}^n$ has a basis of eigenvectors of $A$. Further, eigenvectors with distinct eigenvalues are perpendicular. Thus $\mathbb{R}^n$ has an orthonormal basis of eigenvectors. This means that we may refine our diagonalization algorithm of 3, adding an extra step: use Gram-Schmidt to obtain an ON basis $u_{i,1}, \dots, u_{i,s_i}$ of $E(\lambda_i)$ from the basis $v_{i,1}, \dots, v_{i,s_i}$. Since eigenvectors with distinct eigenvalues are perpendicular, we may put all of these ON sets of vectors together to obtain an ON basis $u_{1,1}, \dots, u_{1,s_1}, u_{2,1}, \dots, u_{r,s_r}$ of $\mathbb{R}^n$. Let $U = (u_{1,1}, \dots, u_{1,s_1}, u_{2,1}, \dots, u_{r,s_r})$. $U$ is an orthogonal matrix, so that $U^{-1} = U^T$. We have orthogonally diagonalized $A$:
\[ U^T A\, U = D = \begin{pmatrix} \lambda_1 & & & \\ & \lambda_1 & & \\ & & \ddots & \\ & & & \lambda_r \end{pmatrix}, \]
where each $\lambda_i$ is repeated $s_i$ times on the diagonal, all nondiagonal entries of $D$ are zero, and $U$ is an orthogonal matrix.
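The orthogonal diagonalization of 5 can be sketched with numpy's `eigh`, which returns the eigenvalues of a real symmetric matrix together with an orthogonal matrix of eigenvectors; the matrix below is a made-up example.

```python
# A numerical sketch of 5: numpy's eigh orthogonally diagonalizes a real
# symmetric matrix. The matrix A is a made-up example.
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, U = np.linalg.eigh(A)    # columns of U: ON eigenvector basis

assert np.allclose(U.T @ U, np.eye(2))                 # U is orthogonal
assert np.allclose(U.T @ A @ U, np.diag(eigenvalues))  # U^T A U = D
```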
6. Triangularization of Matrices. Suppose that $A \in M_{n \times n}(F)$. A triangularization of $A$ (over $F$) is a factorization $P^{-1} A P = T$ where $P, T \in M_{n \times n}(F)$, $P$ is invertible and $T$ is upper triangular. $A$ is triangularizable over $F$ if and only if all of the eigenvalues of $A$ are in $F$ (this will always be true if $F = \mathbb{C}$ is the complex numbers). The following algorithm produces a triangularization of $A$.

1. Let $v_1, \dots, v_s$ be a maximal set of linearly independent eigenvectors for $A$, with respective eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_s$. Extend $\{v_1, \dots, v_s\}$ to a basis $v_1, \dots, v_n$ of $F^n$. (This can be done for any $F$ by algorithm 5 in "Matrices and applications of Gaussian elimination", or by algorithm 9 or its extension algorithm 11 in "Inner product spaces" if $F$ is contained in $\mathbb{R}$ or $\mathbb{C}$.) Then $P_1 = (v_1, v_2, \dots, v_n)$ satisfies
\[ P_1^{-1} A P_1 = \begin{pmatrix} D & C \\ 0 & B \end{pmatrix}, \]
where $D = \mathrm{diag}(\lambda_1, \dots, \lambda_s)$ is an $s \times s$ diagonal matrix, $C$ is some $s \times (n - s)$ matrix, and $B$ is an $(n - s) \times (n - s)$ matrix.

2. The eigenvalues of $B$ are a subset of the eigenvalues of $A$.

3. If $Q^{-1} B Q = S$ is an upper triangular matrix ($Q$ triangularizes $B$), then
\[ P_2 = \begin{pmatrix} I_s & 0_{s \times (n-s)} \\ 0_{(n-s) \times s} & Q \end{pmatrix} \]
and $P = P_1 P_2$ triangularizes $A$.
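Step 1 of the algorithm can be sketched in sympy on a made-up $2 \times 2$ matrix with a repeated eigenvalue and a single independent eigenvector; here $s = 1$, so the block $B$ is $1 \times 1$ and the result of step 1 is already upper triangular, making steps 2 and 3 trivial.

```python
# Sketch of step 1 of the triangularization algorithm on a made-up
# 2x2 matrix with P_A(t) = (t - 2)^2 and a single independent eigenvector.
import sympy as sp

A = sp.Matrix([[3, 1], [-1, 1]])
lam = 2
v1 = (A - lam * sp.eye(2)).nullspace()[0]   # the single eigenvector
v2 = sp.Matrix([1, 0])                      # extends {v1} to a basis of F^2
P1 = sp.Matrix.hstack(v1, v2)
assert P1.det() != 0                        # (v1, v2) really is a basis

T = P1.inv() * A * P1
assert T[1, 0] == 0                         # upper triangular (s = 1, so done)
assert T[0, 0] == lam                       # lambda_1 in the top-left corner
```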
Jordan Form

For $\lambda \in \mathbb{C}$, the Jordan block $B_n(\lambda)$ is the $n \times n$ matrix
\[ B_n(\lambda) = \begin{pmatrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & 0 \\ & & & \lambda & 1 \\ 0 & \cdots & & 0 & \lambda \end{pmatrix}, \]
with $\lambda$ on the diagonal, $1$ on the superdiagonal, and zeros elsewhere. $B_n(\lambda)$ has the characteristic polynomial
\[ P_{B_n(\lambda)}(t) = \mathrm{Det}(t I_n - B_n(\lambda)) = (t - \lambda)^n. \]
The only eigenvalue of $B_n(\lambda)$ is $\lambda$. The eigenspace of $\lambda$ for $B_n(\lambda)$ is the solution space to $(B_n(\lambda) - \lambda I_n) X = 0$, where
\[ B_n(\lambda) - \lambda I_n = \begin{pmatrix} 0 & 1 & & 0 \\ & \ddots & \ddots & \\ & & 0 & 1 \\ 0 & & & 0 \end{pmatrix}. \]
So the solutions satisfy $x_2 = x_3 = \cdots = x_n = 0$, and a basis of the eigenspace $E(\lambda)$ of $B_n(\lambda)$ consists of the single vector $(1, 0, \dots, 0)^T$. In particular, $B_n(\lambda)$ is diagonalizable if and only if $n = 1$. In this special case, $B_1(\lambda) = (\lambda)$.

A Jordan Matrix $J$ is a matrix
\[ J = \begin{pmatrix} B_{n_{11}}(\lambda_1) & & & 0 \\ & B_{n_{12}}(\lambda_1) & & \\ & & \ddots & \\ 0 & & & B_{n_{s r_s}}(\lambda_s) \end{pmatrix}, \]
where $J$ is a block (partitioned) matrix whose diagonal blocks are the Jordan blocks $B_{n_{ij}}(\lambda_i)$, ordered $B_{n_{11}}(\lambda_1), \dots, B_{n_{1 r_1}}(\lambda_1), B_{n_{21}}(\lambda_2), \dots, B_{n_{s r_s}}(\lambda_s)$, and all off-diagonal blocks are zero. Set $t_i = n_{i1} + n_{i2} + \cdots + n_{i r_i}$ for $1 \leq i \leq s$. $J$ is an $n \times n$ matrix where $n = t_1 + t_2 + \cdots + t_s$. The characteristic polynomial of $J$ is
\[ P_J(t) = \mathrm{Det}(t I_n - J) = (t - \lambda_1)^{t_1} (t - \lambda_2)^{t_2} \cdots (t - \lambda_s)^{t_s}. \]
The eigenvalues of $J$ are $\lambda_1, \dots, \lambda_s$. Let $e(i)$ be the column vector of length $n$ with a $1$ in the $i$th place and zeros everywhere else. A basis for $E(\lambda_1)$ is
\[ \{ e(1),\ e(n_{11} + 1),\ \dots,\ e(n_{11} + \cdots + n_{1, r_1 - 1} + 1) \}. \]
A basis for $E(\lambda_2)$ is
\[ \{ e(t_1 + 1),\ \dots,\ e(t_1 + n_{21} + \cdots + n_{2, r_2 - 1} + 1) \}, \]
and a basis of $E(\lambda_s)$ is
\[ \{ e(t_1 + \cdots + t_{s-1} + 1),\ \dots,\ e(t_1 + \cdots + t_{s-1} + n_{s1} + \cdots + n_{s, r_s - 1} + 1) \}. \]
In particular, $E(\lambda_i)$ has dimension $r_i$, the number of Jordan blocks of $J$ with eigenvalue $\lambda_i$.

Example 1.
\[ A = \begin{pmatrix} 3 & 1 & 0 & 0 & 0 & 0 \\ 0 & 3 & 0 & 0 & 0 & 0 \\ 0 & 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 0 & 2 \end{pmatrix} \]
$A$ is a Jordan matrix with 3 Jordan blocks:
\[ B_2(3) = \begin{pmatrix} 3 & 1 \\ 0 & 3 \end{pmatrix}, \quad B_1(2) = (2), \quad B_3(2) = \begin{pmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{pmatrix}. \]

Theorem 0.2. Every square matrix $A$ with complex coefficients is similar to a Jordan Matrix $J$; that is, there is an invertible complex matrix $C$ such that
\[ J = C^{-1} A C. \]
$J$ is called a Jordan form of $A$. The Jordan form of a matrix $A$ is uniquely determined, up to permuting the Jordan blocks of a Jordan form.

This theorem fails over the reals. Even if $A$ is a real matrix, it will in general not be similar to a real Jordan matrix. The essential point that makes everything work out over the complex numbers is the fundamental theorem of algebra, which states that a nonconstant polynomial with complex coefficients has a complex root, so that it must factor into a product of linear factors (with complex coefficients). Thus every complex matrix has a complex eigenvalue (since the characteristic polynomial must have a complex root). However, there are real matrices which do not have a real eigenvalue.

Example 2. Suppose that $P_A(t) = (t - 2)^2 (t + 3)^2$. Then $A$ has (up to permuting Jordan blocks) one of the following Jordan forms:
\[ F_1 = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & -3 \end{pmatrix}, \quad F_2 = \begin{pmatrix} 2 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & -3 & 0 \\ 0 & 0 & 0 & -3 \end{pmatrix}, \]
\[ F_3 = \begin{pmatrix} 2 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & -3 & 1 \\ 0 & 0 & 0 & -3 \end{pmatrix}, \quad F_4 = \begin{pmatrix} 2 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & -3 & 1 \\ 0 & 0 & 0 & -3 \end{pmatrix}. \]

Suppose that $A$ is an $n \times n$ matrix with complex coefficients. Let $J$ be a Jordan form of $A$ (with all of the above notation), so that $P_A(t) = P_J(t)$. There is a factorization
\[ P_A(t) = (t - \lambda_1)^{t_1} (t - \lambda_2)^{t_2} \cdots (t - \lambda_s)^{t_s}, \]
where the $\lambda_i$ are the distinct complex eigenvalues of $A$, and $t_1 + t_2 + \cdots + t_s = n$. The algebraic multiplicity of $A$ for $\lambda_i$ is $t_i$, and the geometric multiplicity of $A$ for $\lambda_i$ is $\dim E(\lambda_i)$, the dimension of the eigenspace of $\lambda_i$ for $A$. For each eigenvalue $\lambda_i$ of $A$, we have $\dim E(\lambda_i) \leq t_i$. $A$ is diagonalizable if and only if we have equality of the algebraic and geometric multiplicities for all eigenvalues $\lambda_i$ of $A$.

A polynomial $f(t) \in \mathbb{C}[t]$ is monic if its leading coefficient is $1$; that is, $f(t)$ has the form $f(t) = t^n + a_{n-1} t^{n-1} + \cdots + a_0$ with $a_0, a_1, \dots, a_{n-1} \in \mathbb{C}$. The minimal polynomial $q_A(t)$ of $A$ is the (unique) monic polynomial in $\mathbb{C}[t]$ which has the property that $q_A(A) = 0$, and if $f(t) \in \mathbb{C}[t]$ satisfies $f(A) = 0$, then $q_A(t)$ divides $f(t)$. If $B$ is similar to $A$, then $q_B(t) = q_A(t)$, so that $q_A(t) = q_J(t)$. Let $\varphi(i) = \max\{ n_{ij} \mid 1 \leq j \leq r_i \}$. Then
\[ q_A(t) = (t - \lambda_1)^{\varphi(1)} (t - \lambda_2)^{\varphi(2)} \cdots (t - \lambda_s)^{\varphi(s)}. \]
The Cayley-Hamilton theorem tells us that $P_A(A) = 0$, so that $q_A(t)$ divides $P_A(t)$. This gives us a method of computing $q_A(t)$. Assuming that we are able to factor the characteristic polynomial of a matrix $A$, we can thus calculate fairly easily a lot of information about the Jordan form. For matrices of small size, just knowing the characteristic polynomial, the minimal polynomial, and the geometric multiplicities will often uniquely determine the Jordan form. Of course, this is not enough information to compute the Jordan form for general matrices!

Exercises on Jordan Form

1. Which of the following are Jordan matrices?
[The candidate matrices for this exercise are garbled in this copy; they are small matrices built from the entries 0, 2, and 3.]

2. What are the possible Jordan forms of $A$ (up to permutation of Jordan blocks) if $A$ has the given characteristic polynomial?
a) $P_A(t) = (t - 4)^2\, t\, (t + 2)^2$
b) $P_A(t) = (t - 2)^3$

3. Suppose that $A$ is a $4 \times 4$ matrix with eigenvalues $2$ and $5$. Suppose that $E(2)$ has dimension $1$ and $E(5)$ has dimension $3$. What are the possible Jordan forms of $A$ (up to permutation of Jordan blocks)?
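Answers to exercises like these can be checked with sympy's `jordan_form`, which returns $C$ and $J$ with $A = C J C^{-1}$. The matrix below is a made-up example matching the data of exercise 3 (eigenvalues $2$ and $5$, $\dim E(2) = 1$, $\dim E(5) = 3$).

```python
# Checking a Jordan form computation with sympy. The matrix A is a
# made-up example with eigenvalues 2 and 5, dim E(2) = 1, dim E(5) = 3.
import sympy as sp

A = sp.Matrix([
    [2, 1, 0, 0],
    [0, 5, 0, 0],
    [0, 0, 5, 0],
    [0, 0, 0, 5],
])
C, J = A.jordan_form()                # A = C * J * C^{-1}
assert C * J * C.inv() == A

# Geometric multiplicity = number of Jordan blocks for each eigenvalue:
assert len((A - 2 * sp.eye(4)).nullspace()) == 1
assert len((A - 5 * sp.eye(4)).nullspace()) == 3

# Since algebraic and geometric multiplicities agree, J is diagonal.
assert J.is_diagonal()
assert sorted(J.diagonal()) == [2, 5, 5, 5]
```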