AMS526: Numerical Analysis I (Numerical Linear Algebra)
Lecture 16: Eigenvalue Problems; Similarity Transformations
Xiangmin Jiao
Stony Brook University
Eigenvalues and Eigenvectors
- The eigenvalue problem of an $m \times m$ matrix $A$ is $Ax = \lambda x$, with eigenvalues $\lambda$ and (nonzero) eigenvectors $x$
- The set of all the eigenvalues of $A$ is the spectrum of $A$
- Eigenvalues are generally used where a matrix is to be compounded iteratively
- Eigenvalues are useful for algorithmic and physical reasons:
  - Algorithmically, eigenvalue analysis can reduce a coupled system to a collection of scalar problems
  - Physically, eigenvalue analysis can be used to study resonance of musical instruments and stability of physical systems
Eigenvalue Decomposition
- The eigenvalue decomposition of $A$ is $A = X \Lambda X^{-1}$, or $AX = X\Lambda$, with the eigenvectors $x_i$ as columns of $X$ and the eigenvalues $\lambda_i$ along the diagonal of $\Lambda$. Alternatively, $Ax_i = \lambda_i x_i$
- The eigenvalue decomposition is a change of basis to eigenvector coordinates: $Ax = b \iff (X^{-1}b) = \Lambda (X^{-1}x)$
- Note that an eigenvalue decomposition may not exist
- Question: How does the eigenvalue decomposition differ from the SVD?
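A minimal NumPy sketch of this decomposition (the matrix is just an illustrative example): numpy.linalg.eig returns the diagonal of $\Lambda$ and the columns of $X$.

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, X = np.linalg.eig(A)      # eigenvalues, and eigenvectors as columns of X
Lam = np.diag(lam)             # Λ as a diagonal matrix
print(np.allclose(A, X @ Lam @ np.linalg.inv(X)))   # A = X Λ X^{-1}
print(np.allclose(A @ X, X @ Lam))                  # A X = X Λ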
Geometric Multiplicity
- The eigenvectors corresponding to a single eigenvalue $\lambda$ form an eigenspace $E_\lambda \subseteq \mathbb{C}^m$
- An eigenspace is invariant in that $A E_\lambda \subseteq E_\lambda$
- The dimension of $E_\lambda$ is the maximum number of linearly independent eigenvectors that can be found
- The geometric multiplicity of $\lambda$ is the dimension of $E_\lambda$, i.e., $\dim(\mathrm{null}(A - \lambda I))$
Algebraic Multiplicity
- The characteristic polynomial of $A$ is the degree-$m$ polynomial
$$p_A(z) = \det(zI - A) = (z - \lambda_1)(z - \lambda_2)\cdots(z - \lambda_m),$$
which is monic in that the coefficient of $z^m$ is 1
- $\lambda$ is an eigenvalue of $A$ iff $p_A(\lambda) = 0$:
  - If $\lambda$ is an eigenvalue, then by definition $\lambda x - Ax = (\lambda I - A)x = 0$, so $(\lambda I - A)$ is singular and its determinant is 0
  - If $(\lambda I - A)$ is singular, then for $x \in \mathrm{null}(\lambda I - A)$ we have $\lambda x - Ax = 0$
- The algebraic multiplicity of $\lambda$ is its multiplicity as a root of $p_A$
- Any matrix $A$ has $m$ eigenvalues, counted with algebraic multiplicity
- Question: What are the eigenvalues of a triangular matrix?
- Question: How are geometric multiplicity and algebraic multiplicity related?
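A quick NumPy check of this connection (an illustrative sketch): numpy.poly returns the monic characteristic-polynomial coefficients of a matrix, and numpy.roots recovers its eigenvalues.

import numpy as np

A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
coeffs = np.poly(A)                      # coefficients of p_A(z) = det(zI - A)
print(coeffs[0])                         # 1.0: p_A is monic
print(np.sort(np.roots(coeffs)))         # roots of p_A ...
print(np.sort(np.linalg.eigvals(A)))     # ... match the eigenvalues of A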
Similarity Transformations
- The map $A \mapsto Y^{-1}AY$ is a similarity transformation of $A$ for any nonsingular $Y \in \mathbb{C}^{m \times m}$
- $A$ and $B$ are similar if there is a similarity transformation $B = Y^{-1}AY$
- Theorem: If $Y$ is nonsingular, then $A$ and $Y^{-1}AY$ have the same characteristic polynomial, eigenvalues, and algebraic and geometric multiplicities.
  1. For the characteristic polynomial: $\det(zI - Y^{-1}AY) = \det(Y^{-1}(zI - A)Y) = \det(zI - A)$, so the algebraic multiplicities remain the same
  2. If $x \in E_\lambda$ for $A$, then $Y^{-1}x$ is in the eigenspace of $Y^{-1}AY$ corresponding to $\lambda$, and vice versa, so the geometric multiplicities remain the same
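A numerical sanity check of the theorem (sketch; a random $Y$ is generically nonsingular):

import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))          # generically nonsingular
B = np.linalg.inv(Y) @ A @ Y             # B is similar to A
print(np.sort_complex(np.linalg.eigvals(A)))   # same spectrum ...
print(np.sort_complex(np.linalg.eigvals(B)))   # ... up to rounding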
Algebraic Multiplicity ≥ Geometric Multiplicity
- Let $n$ be the geometric multiplicity of $\lambda$ for $A$, and let $\hat{V} \in \mathbb{C}^{m \times n}$ have orthonormal columns forming a basis of $E_\lambda$
- Extend $\hat{V}$ to a unitary $V = [\hat{V}, \tilde{V}] \in \mathbb{C}^{m \times m}$ and form
$$B = V^* A V = \begin{bmatrix} \hat{V}^* A \hat{V} & \hat{V}^* A \tilde{V} \\ \tilde{V}^* A \hat{V} & \tilde{V}^* A \tilde{V} \end{bmatrix} = \begin{bmatrix} \lambda I & C \\ 0 & D \end{bmatrix}$$
- $\det(zI - B) = \det(zI - \lambda I)\,\det(zI - D) = (z - \lambda)^n \det(zI - D)$, so the algebraic multiplicity of $\lambda$ as an eigenvalue of $B$ is at least $n$
- $A$ and $B$ are similar, so the algebraic multiplicity of $\lambda$ as an eigenvalue of $A$ is at least $n$
- Example:
$$A = \begin{bmatrix} 2 & & \\ & 2 & \\ & & 2 \end{bmatrix}, \qquad B = \begin{bmatrix} 2 & 1 & \\ & 2 & 1 \\ & & 2 \end{bmatrix}$$
Their characteristic polynomial is $(z-2)^3$, so the algebraic multiplicity of $\lambda = 2$ is 3. But the geometric multiplicity of $A$ is 3 and that of $B$ is 1.
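The geometric multiplicities in the example can be checked numerically (sketch), using $\dim E_\lambda = m - \mathrm{rank}(A - \lambda I)$:

import numpy as np

A = np.diag([2.0, 2.0, 2.0])             # geometric multiplicity 3
B = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [0.0, 0.0, 2.0]])          # Jordan block: geometric multiplicity 1
I = np.eye(3)
for M in (A, B):
    geo = 3 - np.linalg.matrix_rank(M - 2.0 * I)   # dim null(M - 2I)
    print(geo)                                     # 3 for A, 1 for B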
Defective and Diagonalizable Matrices
- An eigenvalue of a matrix is defective if its algebraic multiplicity > its geometric multiplicity
- A matrix is defective if it has a defective eigenvalue. Otherwise, it is called nondefective.
- Theorem: An $m \times m$ matrix $A$ is nondefective iff it has an eigenvalue decomposition $A = X \Lambda X^{-1}$.
  - ($\Leftarrow$) $\Lambda$ is nondefective, and $A$ is similar to $\Lambda$, so $A$ is nondefective.
  - ($\Rightarrow$) A nondefective matrix has $m$ linearly independent eigenvectors. Take them as columns of $X$ to obtain $A = X \Lambda X^{-1}$.
- Nondefective matrices are therefore also said to be diagonalizable.
Determinant and Trace
- The determinant of $A$ is $\det(A) = \prod_{j=1}^m \lambda_j$, because
$$\det(A) = (-1)^m \det(-A) = (-1)^m p_A(0) = \prod_{j=1}^m \lambda_j$$
- The trace of $A$ is $\mathrm{tr}(A) = \sum_{j=1}^m \lambda_j$, since
$$p_A(z) = \det(zI - A) = z^m - \sum_{j=1}^m a_{jj}\, z^{m-1} + O(z^{m-2})$$
$$p_A(z) = \prod_{j=1}^m (z - \lambda_j) = z^m - \sum_{j=1}^m \lambda_j\, z^{m-1} + O(z^{m-2})$$
- Question: Are these results valid for defective or nondefective matrices?
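Both identities are easy to verify numerically (illustrative sketch):

import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
lam = np.linalg.eigvals(A)
# Both hold for any square matrix, defective or not, since the
# eigenvalues are counted with algebraic multiplicity.
print(np.allclose(np.prod(lam), np.linalg.det(A)))   # det(A) = Π λ_j
print(np.allclose(np.sum(lam), np.trace(A)))         # tr(A) = Σ λ_j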
Unitary Diagonalization
- A matrix $A$ is unitarily diagonalizable if $A = Q \Lambda Q^*$ for a unitary matrix $Q$
- A hermitian matrix is unitarily diagonalizable, with real eigenvalues
- A matrix $A$ is normal if $A^* A = A A^*$. Examples of normal matrices include hermitian and skew-hermitian matrices:
  - a hermitian matrix ($A^* = A$) is normal and all its eigenvalues are real
  - a skew-hermitian matrix ($A^* = -A$) is normal and all its eigenvalues are imaginary
- If $A$ is both triangular and normal, then $A$ is diagonal
- Unitarily diagonalizable $\iff$ normal: the "$\Rightarrow$" direction is easy; prove "$\Leftarrow$" by induction using the Schur factorization (next)
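A small check (sketch) that a skew-hermitian matrix is normal with purely imaginary eigenvalues:

import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
A = B - B.conj().T                                   # skew-hermitian: A* = -A
print(np.allclose(A.conj().T @ A, A @ A.conj().T))   # normal: A*A = AA*
print(np.allclose(np.linalg.eigvals(A).real, 0.0))   # eigenvalues purely imaginary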
Schur Factorization
- A Schur factorization is $A = QTQ^*$, where $Q$ is unitary and $T$ is upper triangular
- Theorem: Every square matrix $A$ has a Schur factorization.
- Proof by induction on the dimension of $A$. The case $m = 1$ is trivial. For $m \geq 2$, let $x$ be any unit eigenvector of $A$, with corresponding eigenvalue $\lambda$. Let $U$ be a unitary matrix with $x$ as its first column. Then
$$U^* A U = \begin{bmatrix} \lambda & w^* \\ 0 & C \end{bmatrix}.$$
By the induction hypothesis, there is a Schur factorization $C = V \tilde{T} V^*$. Let
$$Q = U \begin{bmatrix} 1 & 0 \\ 0 & V \end{bmatrix}, \qquad T = \begin{bmatrix} \lambda & w^* V \\ 0 & \tilde{T} \end{bmatrix},$$
and then $A = QTQ^*$.
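In SciPy, scipy.linalg.schur computes this factorization (sketch; output='complex' requests the upper-triangular complex Schur form):

import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(3)
A = rng.standard_normal((4, 4))
T, Q = schur(A, output='complex')                # A = Q T Q*, Q unitary
print(np.allclose(A, Q @ T @ Q.conj().T))        # factorization holds
print(np.allclose(np.tril(T, -1), 0.0))          # T is upper triangular
print(np.sort_complex(np.diag(T)))               # eigenvalues of A on diag(T)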
Eigenvalue-Revealing Factorizations
Eigenvalue-revealing factorizations of a square matrix $A$:
- Diagonalization $A = X \Lambda X^{-1}$ (nondefective $A$)
- Unitary diagonalization $A = Q \Lambda Q^*$ (normal $A$)
- Unitary triangularization (Schur factorization) $A = QTQ^*$ (any $A$)
- Jordan normal form $A = XJX^{-1}$, where $J$ is block diagonal with blocks
$$J_i = \begin{bmatrix} \lambda_i & 1 & & \\ & \lambda_i & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_i \end{bmatrix}$$
In general, the Schur factorization is used, because:
- unitary matrices are involved, so the algorithm tends to be more stable
- if $A$ is normal, then the Schur form $T$ is diagonal
Simplification: Real Symmetric Matrices
- We will consider eigenvalue problems for real symmetric matrices, i.e., $A = A^T \in \mathbb{R}^{m \times m}$, and $Ax = \lambda x$ for $x \in \mathbb{R}^m$
- Note: $x^* = x^T$, and $\|x\| = \sqrt{x^T x}$
- $A$ has real eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_m$ and orthonormal eigenvectors $q_1, q_2, \ldots, q_m$, where $\|q_j\| = 1$
- Eigenvalues are often also ordered in a particular way (e.g., from largest to smallest in magnitude)
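For this symmetric case, numpy.linalg.eigh exploits symmetry and returns real eigenvalues with orthonormal eigenvectors (sketch; eigh orders eigenvalues ascending, not by magnitude):

import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((5, 5))
A = (M + M.T) / 2                               # symmetrize
lam, Q = np.linalg.eigh(A)                      # real eigenvalues, ascending
print(np.allclose(Q.T @ Q, np.eye(5)))          # columns q_j are orthonormal
print(np.allclose(A, Q @ np.diag(lam) @ Q.T))   # A = Q Λ Q^T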
Solving Eigenvalue Problems
- An eigenvalue-revealing factorization cannot be computed in a finite number of steps: any eigenvalue solver must be iterative
- To see this, consider a general polynomial of degree $m$:
$$p(z) = z^m + a_{m-1} z^{m-1} + \cdots + a_1 z + a_0$$
- There is no closed-form expression for the roots of $p$. (Abel, 1824) In general, the roots of polynomial equations of degree higher than four cannot be written in terms of a finite number of operations.
A Fundamental Difficulty, Cont'd
- The roots of $p$ are the eigenvalues of the companion matrix
$$A = \begin{bmatrix} 0 & & & & -a_0 \\ 1 & 0 & & & -a_1 \\ & 1 & \ddots & & \vdots \\ & & \ddots & 0 & -a_{m-2} \\ & & & 1 & -a_{m-1} \end{bmatrix}$$
- Therefore, in general, we cannot find the eigenvalues of a matrix in a finite number of steps
- In practice, however, there are algorithms that converge to desired precision in a few iterations
- An obvious method is to find the roots of the characteristic polynomial $p_A(\lambda)$, but it is very ill-conditioned
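A sketch constructing this companion matrix for a concrete cubic and recovering its roots as eigenvalues:

import numpy as np

# p(z) = z^3 - 6z^2 + 11z - 6 = (z-1)(z-2)(z-3); a = [a_0, a_1, a_2]
a = np.array([-6.0, 11.0, -6.0])
m = len(a)
A = np.zeros((m, m))
A[1:, :-1] = np.eye(m - 1)                 # ones on the subdiagonal
A[:, -1] = -a                              # last column: -a_0, ..., -a_{m-1}
print(np.sort_complex(np.linalg.eigvals(A)))   # ≈ 1, 2, 3: the roots of p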
Power Iteration
Simple power iteration for the largest eigenvalue:

Algorithm: Power Iteration
    $v^{(0)} =$ some unit-length vector
    for $k = 1, 2, \ldots$
        $w = A v^{(k-1)}$
        $v^{(k)} = w / \|w\|$
        $\lambda^{(k)} = r(v^{(k)}) = (v^{(k)})^T A v^{(k)}$

The termination condition is omitted for simplicity.
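A direct NumPy transcription of the algorithm (a minimal sketch; the tolerance-based stopping test is an addition, since the slide omits termination):

import numpy as np

def power_iteration(A, tol=1e-10, maxit=1000):
    """Dominant eigenvalue/eigenvector of symmetric A by power iteration."""
    m = A.shape[0]
    v = np.ones(m) / np.sqrt(m)         # some unit-length starting vector
    lam = v @ A @ v
    for _ in range(maxit):
        w = A @ v                       # apply A
        v = w / np.linalg.norm(w)       # normalize
        lam_new = v @ A @ v             # Rayleigh quotient estimate
        if abs(lam_new - lam) < tol:    # stopping test (omitted on the slide)
            break
        lam = lam_new
    return lam_new, v

A = np.array([[2.0, 1.0], [1.0, 3.0]])
lam, v = power_iteration(A)
print(lam)    # ≈ (5 + sqrt(5))/2 ≈ 3.618, the dominant eigenvalue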
Convergence of Power Iteration
- Expand the initial $v^{(0)}$ in the orthonormal eigenvectors $q_i$, and apply $A^k$:
$$v^{(0)} = a_1 q_1 + a_2 q_2 + \cdots + a_m q_m$$
$$v^{(k)} = c_k A^k v^{(0)} = c_k \left(a_1 \lambda_1^k q_1 + a_2 \lambda_2^k q_2 + \cdots + a_m \lambda_m^k q_m\right) = c_k \lambda_1^k \left(a_1 q_1 + a_2 (\lambda_2/\lambda_1)^k q_2 + \cdots + a_m (\lambda_m/\lambda_1)^k q_m\right)$$
- If $|\lambda_1| > |\lambda_2| \geq \cdots \geq |\lambda_m| \geq 0$ and $q_1^T v^{(0)} \neq 0$, this gives
$$\left\|v^{(k)} - (\pm q_1)\right\| = O\left(|\lambda_2/\lambda_1|^k\right), \qquad \left|\lambda^{(k)} - \lambda_1\right| = O\left(|\lambda_2/\lambda_1|^{2k}\right)$$
where the $\pm$ sign is chosen to be the sign of $q_1^T v^{(k)}$
- It finds the largest eigenvalue (unless the corresponding eigenvector is orthogonal to $v^{(0)}$)
- The error decreases by only a constant factor $|\lambda_2/\lambda_1|$ at each step, so convergence is very slow, especially when $|\lambda_2| \approx |\lambda_1|$
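A numeric check (sketch) of the predicted rates, using a symmetric matrix built with known eigenvalues so $|\lambda_2/\lambda_1|$ is explicit:

import numpy as np

lam_true = np.array([1.0, 0.5, 0.25])              # λ1 = 1, |λ2/λ1| = 0.5
rng = np.random.default_rng(5)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # random orthogonal Q
A = Q @ np.diag(lam_true) @ Q.T
q1 = Q[:, 0]

v = np.ones(3) / np.sqrt(3)                        # generically q1^T v != 0
for k in range(1, 11):
    w = A @ v
    v = w / np.linalg.norm(w)
    lam_k = v @ A @ v
    ev_err = min(np.linalg.norm(v - q1), np.linalg.norm(v + q1))
    print(k, ev_err, abs(lam_k - 1.0))   # ~0.5^k and ~0.5^(2k), respectively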
Rayleigh Quotient
- The Rayleigh quotient of $x \in \mathbb{R}^m$ is the scalar
$$r(x) = \frac{x^T A x}{x^T x}$$
- For an eigenvector $x$, its Rayleigh quotient is $r(x) = x^T \lambda x / x^T x = \lambda$, the corresponding eigenvalue of $x$
- For general $x$, $r(x)$ is the value $\alpha$ that minimizes $\|Ax - \alpha x\|_2$
- $x$ is an eigenvector of $A$ $\iff$ $\nabla r(x) = \frac{2}{x^T x}\left(Ax - r(x)x\right) = 0$ with $x \neq 0$
- $r(x)$ is smooth and $\nabla r(q_j) = 0$ for any $j$, and therefore $r$ is quadratically accurate: $r(x) - r(q_J) = O\left(\|x - q_J\|^2\right)$ as $x \to q_J$ for some $J$
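A quick check (sketch) of the quadratic accuracy: perturb an eigenvector by $\epsilon$ and observe an $O(\epsilon^2)$ error in the Rayleigh quotient.

import numpy as np

A = np.array([[2.0, 1.0], [1.0, 3.0]])
lam, Q = np.linalg.eigh(A)
q = Q[:, 1]                           # eigenvector of the largest eigenvalue
t = np.array([-q[1], q[0]])           # unit vector orthogonal to q
for eps in (1e-2, 1e-3, 1e-4):
    x = q + eps * t
    r = (x @ A @ x) / (x @ x)         # Rayleigh quotient of perturbed vector
    print(eps, abs(r - lam[1]))       # error scales like eps**2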