Matrix Theory, Math 634
Lecture Notes from September 27, 2012
taken by Tasadduk Chowdhury

Last Time (9/25/12):

- QR factorization: any matrix $A \in M_n$ has a QR factorization $A = QR$, where $Q$ is unitary and $R$ is upper triangular. In addition, we proved that if $A$ is non-singular, then $Q$ and $R$ are unique.
- Cholesky factorization: any $B \in M_n$ satisfying $B = A^* A$ for some $A \in M_n$ has the factorization $B = LL^*$, where $L \in M_n$ is lower triangular and non-negative on the diagonal.
- QR algorithm: details and convergence criteria.

Warm-up

If $A_1 = A \in M_n$ is normal and the QR algorithm converges (entry-wise), what can we say about the limiting matrix $A_\infty$?

Convergence implies that there are a unitary $Q_\infty$ and an upper-triangular $R_\infty$ such that
$$A_\infty = Q_\infty R_\infty = R_\infty Q_\infty. \tag{1}$$
From (1), we see that $Q_\infty$ and $R_\infty$ commute. Moreover,
$$A_\infty Q_\infty = Q_\infty R_\infty Q_\infty = Q_\infty A_\infty, \tag{2}$$
so $Q_\infty$ and $A_\infty$ also commute. From (1) we also get
$$Q_\infty^* A_\infty = R_\infty = R_\infty Q_\infty Q_\infty^* = A_\infty Q_\infty^*. \tag{3}$$
By taking adjoints, we get $Q_\infty^* A_\infty^* = A_\infty^* Q_\infty^*$ from (2) and $A_\infty^* Q_\infty = Q_\infty A_\infty^*$ from (3). Thus,
$$R_\infty R_\infty^* = Q_\infty^* A_\infty A_\infty^* Q_\infty = A_\infty Q_\infty^* Q_\infty A_\infty^* \quad \text{(by commutativity)}$$
$$= A_\infty A_\infty^* = A_\infty^* A_\infty \quad \text{(by normality of } A_\infty\text{)}$$
$$= A_\infty^* Q_\infty Q_\infty^* A_\infty = R_\infty^* R_\infty,$$
so $R_\infty$ is normal.
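The commutation relations above can be sanity-checked numerically. The following is a minimal sketch, assuming NumPy is available; the $2 \times 2$ symmetric test matrix and the iteration count are illustrative choices, not from the notes.

```python
import numpy as np

# Run the QR iteration A_{k+1} = R_k Q_k on a normal (symmetric) matrix,
# then check the warm-up's claims near the limit: Q and R commute, and
# R is normal.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # eigenvalues 3 and 1, distinct magnitudes

Ak = A.copy()
for _ in range(100):
    Q, R = np.linalg.qr(Ak)
    Ak = R @ Q

Q, R = np.linalg.qr(Ak)      # factor the (near-)limit: A_inf = Q R
print(np.allclose(Q @ R, R @ Q, atol=1e-8))      # True: Q and R commute
print(np.allclose(R @ R.T, R.T @ R, atol=1e-8))  # True: R is normal
```

Since the eigenvalue magnitudes 3 and 1 are distinct, the iterates settle to a diagonal matrix, where both claims hold up to rounding error.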
(We claimed above that $A_\infty$ is normal. From last lecture's proposition on the QR algorithm we know that the iterates $A_k$ are all unitarily equivalent. It can be checked that any matrix unitarily equivalent to a normal matrix is also normal: say $X$ and $Y$ are unitarily equivalent, $Y = U^* X U$, and $X$ is normal. Then
$$Y^* Y = (U^* X U)^* (U^* X U) = U^* X^* U U^* X U = U^* X^* X U = U^* X X^* U = U^* X U U^* X^* U = Y Y^*,$$
and thus $Y$ is normal. Hence, since the $A_k$ are unitarily equivalent to $A_1 = A$, and $A$ is normal, the limiting matrix $A_\infty$ is also normal.)

By normality of $R_\infty$ and the fact that it is triangular, $R_\infty = D_\infty$ with $D_\infty$ a diagonal matrix, and $A_\infty = Q_\infty D_\infty = D_\infty Q_\infty$. Since $\{Q_\infty, D_\infty\}$ is a commuting family in $M_n$, there exists a unitary $U \in M_n$ that diagonalizes both: $U^* D_\infty U = D_\infty$ and $U^* Q_\infty U = \tilde{Q}_\infty$, where $\tilde{Q}_\infty$ is a diagonal matrix. We will denote the $i$-th diagonal entry of $\tilde{Q}_\infty$ by $\omega_i$. Note that $|\omega_i| = 1$, since the eigenvalues of a unitary matrix have modulus one, and the eigenvalues of $\tilde{Q}_\infty$ are its diagonal entries. Hence, we obtain the following result:
$$U^* D_\infty Q_\infty U = U^* D_\infty U \, U^* Q_\infty U = D_\infty \tilde{Q}_\infty = \begin{pmatrix} d_{11} & & \\ & \ddots & \\ & & d_{nn} \end{pmatrix} \begin{pmatrix} \omega_1 & & \\ & \ddots & \\ & & \omega_n \end{pmatrix} = \begin{pmatrix} d_{11}\omega_1 & & \\ & \ddots & \\ & & d_{nn}\omega_n \end{pmatrix}.$$
The matrix $D_\infty \tilde{Q}_\infty$ is unitarily equivalent to $A_\infty = D_\infty Q_\infty$, and $A_\infty$ is unitarily equivalent to $A$ (this follows from the QR algorithm), so $D_\infty \tilde{Q}_\infty$ is unitarily equivalent to $A$. Hence $\lambda_j = d_{jj}\omega_j$ are the eigenvalues of $A$. Since $|\omega_j| = 1$ and $d_{jj} \geq 0$, we know $d_{jj} = |\lambda_j|$. Moreover, if the diagonal entries of $D_\infty$ are distinct (the magnitudes of the eigenvalues of $A$ are distinct), then comparing the $(i,j)$ entries of $D_\infty Q_\infty = Q_\infty D_\infty$ gives $(d_{ii} - d_{jj}) q_{ij} = 0$, so the off-diagonal entries of $Q_\infty$ must be zero, and we have precisely $Q_\infty = \tilde{Q}_\infty$. Thus, $Q_\infty$ is diagonal and unitary, and
$$A_\infty = \begin{pmatrix} d_{11}\omega_1 & & \\ & d_{22}\omega_2 & \\ & & \ddots & \\ & & & d_{nn}\omega_n \end{pmatrix}.$$
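The limiting behavior derived above can be illustrated numerically. The following is a minimal sketch, assuming NumPy; the symmetric test matrix and the iteration count are illustrative choices, not from the notes.

```python
import numpy as np

def qr_algorithm(A, iterations=200):
    """Unshifted QR algorithm: A_{k+1} = R_k Q_k.

    Each iterate is unitarily equivalent to A, since R Q = Q^t (Q R) Q.
    """
    Ak = A.copy()
    for _ in range(iterations):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q
    return Ak

# Symmetric (hence normal) matrix whose eigenvalues have distinct magnitudes.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])

A_inf = qr_algorithm(A)

# The limit is diagonal, and its diagonal entries are the eigenvalues of A.
print(np.allclose(A_inf - np.diag(np.diag(A_inf)), 0.0, atol=1e-8))  # True
print(np.allclose(np.sort(np.diag(A_inf)), np.linalg.eigvalsh(A)))   # True
```

As the derivation predicts, the off-diagonal entries decay to zero and the diagonal carries the spectrum; with eigenvalue magnitudes that are not distinct, the iteration need not settle entrywise.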
So if the magnitudes of the eigenvalues of a normal matrix $A$ are distinct and the QR algorithm converges, then it diagonalizes $A$.

1.5 Real Matrices (cont'd)

We proceed to show under which conditions a matrix with entries in $\mathbb{R}$ can be diagonalized in the same way as matrices with complex entries.

1.5.1 Definition. A matrix $A \in M_n(\mathbb{R})$ is similar to $B \in M_n(\mathbb{R})$ if there exists an invertible $S \in M_n(\mathbb{R})$ such that $B = S^{-1} A S$. The matrix $A$ is diagonalizable if $A$ is similar to a diagonal matrix.

1.5.2 Theorem. $A \in M_n(\mathbb{R})$ is diagonalizable if and only if there is a set of $n$ linearly independent eigenvectors.

Proof. As before, $S^{-1} A S = D$ with $D$ diagonal implies that $S = [x_1, x_2, \dots, x_n]$ contains a basis of $n$ eigenvectors, and vice versa.

1.5.3 Theorem. If $A \in M_n(\mathbb{R})$ has $n$ distinct (real) eigenvalues, then it is diagonalizable.

Proof. As before, we use the fact that eigenvectors belonging to distinct eigenvalues are linearly independent.

1.5.4 Theorem. A matrix $A \in M_n(\mathbb{R})$ is diagonalizable if and only if it has $n$ real eigenvalues (with multiplicities counted) and, for each eigenvalue, the geometric and algebraic multiplicities are equal.

Proof. Extends the preceding theorem; same strategy as before.

The following theorem is the real case of Schur's triangularization theorem.

1.5.5 Theorem. If $A \in M_n(\mathbb{R})$ has $n$ (real) eigenvalues (counting multiplicity), then there exists an orthogonal matrix $O \in M_n(\mathbb{R})$ such that $O^t A O = T$, where $T$ is triangular and the eigenvalues of $A$ are the diagonal entries of $T$.

Proof. This proof is identical to the proof of Schur's theorem in the complex case. We prove it by induction on the dimension $n$. For $n = 1$, it is trivially true. Now suppose the theorem holds for all matrices in $M_{n-1}(\mathbb{R})$. Let $A \in M_n(\mathbb{R})$ have real eigenvalues $\lambda_1, \lambda_2, \dots, \lambda_n$. Choose an eigenvector $x \in \mathbb{R}^n$ with $Ax = \lambda_1 x$ and $\|x\| = 1$. Now $x$ may be extended to a basis $\{x, y_2, \dots, y_n\}$ of $\mathbb{R}^n$.
Apply Gram-Schmidt orthonormalization to this basis to produce an orthonormal basis $\{x, z_2, \dots, z_n\}$ of $\mathbb{R}^n$. Define the matrix $O_1 = [x, z_2, \dots, z_n]$. Note that $O_1$ is orthogonal, since its columns are orthonormal. Since the first column of $A O_1$ is $Ax = \lambda_1 x$, we have
$$O_1^t A O_1 = \begin{pmatrix} \lambda_1 & \ast \\ 0 & B \end{pmatrix},$$
where $B \in M_{n-1}(\mathbb{R})$. Now the characteristic polynomial of $A$ factors as $p_A(t) = (t - \lambda_1) p_B(t)$, where $p_B(t)$ is the characteristic polynomial of $B$. This means that $\lambda_2, \dots, \lambda_n$ are the eigenvalues of $B$. Now, by the induction assumption, there exists an orthogonal $\tilde{O} \in M_{n-1}(\mathbb{R})$ such that $\tilde{O}^t B \tilde{O} = \tilde{T}$, where $\tilde{T}$ is upper triangular with $\lambda_2, \dots, \lambda_n$ on the diagonal. Define
$$O_2 = \begin{pmatrix} 1 & 0 \\ 0 & \tilde{O} \end{pmatrix}.$$
Note that $O_2$ is orthogonal (its columns are orthonormal). Let $O = O_1 O_2$. Then $O$ is orthogonal, since
$$O^t = (O_1 O_2)^t = O_2^t O_1^t = O_2^{-1} O_1^{-1} = (O_1 O_2)^{-1} = O^{-1}.$$
Then
$$O^t A O = (O_1 O_2)^t A O_1 O_2 = O_2^t (O_1^t A O_1) O_2 = O_2^t \begin{pmatrix} \lambda_1 & \ast \\ 0 & B \end{pmatrix} O_2 = \begin{pmatrix} \lambda_1 & \ast \\ 0 & \tilde{O}^t B \tilde{O} \end{pmatrix} = \begin{pmatrix} \lambda_1 & \ast \\ 0 & \tilde{T} \end{pmatrix} = T.$$
Thus, we have triangularized $A$ to a matrix $T$ which has the eigenvalues of $A$ on its diagonal.

1.5.6 Theorem. If $A \in M_n(\mathbb{R})$ is symmetric, then there exists an orthogonal matrix $O \in M_n(\mathbb{R})$ such that $O^t A O = D$, where $D$ is a diagonal matrix containing the eigenvalues of $A$ as its diagonal entries.
Proof. By symmetry, all eigenvalues of $A$ are real (previously proven); this applies to all (complex) roots obtained from factoring the characteristic polynomial $p_A$ of $A$ over $\mathbb{C}$. This implies $A$ has $n$ real eigenvalues. Consequently, Schur's triangularization (over $\mathbb{R}$) gives an orthogonal $O$ with $O^t A O = T$, $T$ triangular. But $T = O^t A O$ is symmetric, hence normal, and a normal triangular matrix is diagonal.

1.5.7 Question. What if $A$ is not symmetric?

1.6 Block Triangularization

1.6.8 Theorem. If $A \in M_n(\mathbb{R})$, then there exists a real orthogonal matrix $O \in M_n(\mathbb{R})$ such that
$$O^t A O = \begin{pmatrix} A_1 & \ast & \cdots & \ast \\ & A_2 & \ddots & \vdots \\ & & \ddots & \ast \\ & & & A_r \end{pmatrix},$$
with diagonal blocks $A_j \in M_1(\mathbb{R})$ or $A_j \in M_2(\mathbb{R})$.

Proof. Repeat Schur's triangularization procedure. To begin with, if $\lambda_1$ is real, then there is a (real) $x_1 \in \mathbb{R}^n$ with $A x_1 = \lambda_1 x_1$. So normalizing $x_1$ and complementing it to an orthonormal basis of $\mathbb{R}^n$, viewed as columns of an orthogonal matrix $O_1$, gives
$$O_1^t A O_1 = \begin{pmatrix} \lambda_1 & \ast \\ 0 & \ast \end{pmatrix},$$
and we proceed with the lower right block as before. If $\lambda_1 = \alpha + i\beta$ with $\alpha, \beta \in \mathbb{R}$ and $\beta \neq 0$, then there is an eigenvalue $\lambda_2$ such that $\lambda_2 = \overline{\lambda_1} = \alpha - i\beta$. Why? Take an eigenvector $x \in \mathbb{C}^n$ belonging to the eigenvalue $\lambda_1$ and take real and imaginary parts, $x = u + iv$ with $u, v \in \mathbb{R}^n$. Then
$$Ax = A(u + iv) = Au + iAv = \lambda_1 (u + iv) = (\alpha + i\beta)(u + iv) = \alpha u - \beta v + i(\beta u + \alpha v).$$
Comparing real and imaginary parts, we have
$$Au = \alpha u - \beta v \quad \text{and} \quad Av = \beta u + \alpha v.$$
Consider $\bar{x} = u - iv$. Then
$$A(u - iv) = \alpha u - \beta v - i(\beta u + \alpha v) = (\alpha - i\beta)(u - iv).$$
We conclude that $\bar{x}$ is an eigenvector belonging to the eigenvalue $\overline{\lambda_1}$. Since $\lambda_1 \neq \overline{\lambda_1}$, the set $\{x, \bar{x}\}$ is linearly independent. This means that $\{u, v\}$ is also linearly independent, since $u = \frac{1}{2}(x + \bar{x})$ and $v = \frac{1}{2i}(x - \bar{x})$. By Gram-Schmidt, we can find an orthonormal system with the same (real) span as $\{u, v\}$, say $\{z, w\}$. Since $\operatorname{span}(u, v)$ is an invariant subspace, we can write
$$A \begin{pmatrix} z & w \end{pmatrix} = \begin{pmatrix} a_{11} z + a_{21} w & a_{12} z + a_{22} w \end{pmatrix} = \begin{pmatrix} z & w \end{pmatrix} A_1, \quad\text{where}\quad A_1 = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \in M_2(\mathbb{R}),$$
and complementing $\{z, w\}$ to an orthonormal basis of $\mathbb{R}^n$, viewed as columns of an orthogonal matrix, we get
$$\begin{pmatrix} z & w & \cdots \end{pmatrix}^t A \begin{pmatrix} z & w & \cdots \end{pmatrix} = \begin{pmatrix} A_1 & \ast \\ 0 & \ast \end{pmatrix},$$
where in the block-partitioned matrix above the $\ast$'s and $0$'s denote matrices of the proper dimensions. Iterating this procedure, we arrive at the claimed block-triangular form.
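The block-triangular form of Theorem 1.6.8 (the real Schur form) can be computed numerically. The sketch below uses SciPy's `schur` with `output='real'` rather than the inductive construction in the proof; the test matrix, with eigenvalues $\pm i$ and $5$, is an illustrative choice, not from the notes.

```python
import numpy as np
from scipy.linalg import schur

# The 2x2 rotation block contributes the conjugate pair +-i; the last
# diagonal entry contributes the real eigenvalue 5.
A = np.array([[0.0, -1.0, 2.0],
              [1.0,  0.0, 3.0],
              [0.0,  0.0, 5.0]])

T, O = schur(A, output='real')   # O real orthogonal, T quasi upper triangular

print(np.allclose(O.T @ O, np.eye(3)))   # True: O is orthogonal
print(np.allclose(O.T @ A @ O, T))       # True: O^t A O = T
# T has one 2x2 diagonal block (for the pair +-i) and one 1x1 block (for 5),
# so the entry below the first subdiagonal vanishes and at most one entry on
# the first subdiagonal is nonzero:
print(abs(T[2, 0]) < 1e-10 and min(abs(T[1, 0]), abs(T[2, 1])) < 1e-10)
```

The $2 \times 2$ blocks on the diagonal of $T$ play exactly the role of the matrix $A_1$ in the proof: each encodes one complex-conjugate eigenvalue pair of $A$.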