Computational Methods CMSC/AMSC/MAPL 460 Eigenvalues and Eigenvectors Ramani Duraiswami, Dept. of Computer Science
Eigenvalues of a Matrix
Recap: an $N \times N$ matrix $A$ has an eigenvector $x$ (non-zero) with corresponding eigenvalue $\lambda$ if
$$Ax = \lambda x.$$
This means $Ax - \lambda x = 0$, i.e. $(A - \lambda I)x = 0$. If a matrix-vector product gives the zero vector, then either the vector is zero or the matrix has zero determinant (is singular). Since $x \neq 0$, here this means $\det(A - \lambda I) = 0$.
Left and Right Eigenvectors
A right eigenvector of a matrix $A$ satisfies $Ax = \lambda x$. For an $N \times N$ matrix we can also define a left matrix product $y^T A = v^T$. So if $y^T A = \lambda y^T$, then $y$ is a left eigenvector of $A$.
If $A$ is symmetric, $A = A^T$, then transposing $Ax = \lambda x$ gives
$$(Ax)^T = x^T A^T = x^T A = (\lambda x)^T = \lambda x^T.$$
So left and right eigenvectors of a symmetric matrix are the same.
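The definitions above can be checked numerically. The sketch below is only an illustration: the matrices `A` and `S` are assumed test matrices, not taken from the slides. It uses the fact that the left eigenvectors of `A` are the right eigenvectors of its transpose.

```matlab
% Illustrative nonsymmetric matrix (an assumption, chosen for this sketch).
A = [4 3; 1 2];

% Left eigenvectors of A are the right eigenvectors of A transposed.
[XL, DL] = eig(A.');
y   = XL(:, 1);          % one left eigenvector, stored as a column
lam = DL(1, 1);          % its eigenvalue

% Residual of y' * A = lambda * y'; should be near machine precision.
left_residual = norm(y.' * A - lam * y.');

% Illustrative symmetric matrix: a right eigenvector is also a left one.
S = [2 1; 1 3];
[V, E] = eig(S);
v = V(:, 1);
sym_residual = norm(v.' * S - E(1, 1) * v.');
```

Both residuals should be tiny, which is the numerical counterpart of the transpose argument on the slide.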
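Symmetric Matrices
A matrix is symmetric if its transpose is equal to itself: $A$ is symmetric if $A^T = A$. For a complex matrix the corresponding condition is $A^H = A$ (Hermitian).
The eigenvalues of a real symmetric (complex Hermitian) matrix are real, and eigenvectors belonging to distinct eigenvalues are orthogonal. For a real symmetric matrix the eigenvectors can also be chosen real.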
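Symmetric Matrices
The eigenvalues of a real symmetric matrix are real, and its eigenvectors can be chosen real and orthogonal. To see the orthogonality, collect the right eigenvectors as the columns of $X_R$ and the left eigenvectors as the rows of $X_L$:
$$A X_R = X_R \,\mathrm{diag}(\lambda_1, \ldots, \lambda_N), \qquad X_L A = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)\, X_L.$$
Multiply the first equation on the left by $X_L$, the second on the right by $X_R$, and subtract:
$$(X_L X_R)\,\mathrm{diag}(\lambda_1, \ldots, \lambda_N) = \mathrm{diag}(\lambda_1, \ldots, \lambda_N)\,(X_L X_R).$$
The matrix of dot products of the left and right eigenvectors commutes with the diagonal matrix of eigenvalues. The only matrices that commute with a diagonal matrix with distinct diagonal entries are themselves diagonal. So, for distinct eigenvalues, $X_L X_R$ is diagonal: each left eigenvector is orthogonal to every right eigenvector belonging to a different eigenvalue. But for a symmetric matrix $X_L = X_R^T$, so $X_R^T X_R$ is diagonal and the eigenvectors are mutually orthogonal.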
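A quick numerical check of the orthogonality result, as a sketch: the symmetric matrix `S` below is an assumed example, not one from the slides. For a symmetric input, `eig` returns eigenvectors normalized to unit length, so the matrix of dot products should be the identity.

```matlab
% Illustrative real symmetric matrix (an assumption for this sketch).
S = [4 1 0; 1 3 1; 0 1 2];

[V, D] = eig(S);            % columns of V are eigenvectors of S

% Matrix of dot products between eigenvectors; diagonal if they are orthogonal.
G = V.' * V;

orth_err  = norm(G - eye(3));   % distance from the identity matrix
real_eigs = isreal(diag(D));    % true when all eigenvalues are real
```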
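Characteristic Equation
$Ax = \lambda x$ can be written as $(A - \lambda I)x = 0$, which holds for some $x \neq 0$, so $A - \lambda I$ is singular and
$$\det(A - \lambda I) = 0.$$
This is called the characteristic equation, and $\det(A - \lambda I)$ is the characteristic polynomial. If $A$ is $n \times n$, the polynomial is of degree $n$, and so $A$ has $n$ eigenvalues, counting multiplicities.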
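Example
$$A = \begin{pmatrix} 4 & 3 \\ 1 & 2 \end{pmatrix}, \qquad A - \lambda I = \begin{pmatrix} 4 - \lambda & 3 \\ 1 & 2 - \lambda \end{pmatrix}$$
$$\det(A - \lambda I) = 0 \;\Rightarrow\; (4 - \lambda)(2 - \lambda) - (3)(1) = 0$$
$$\lambda^2 - 6\lambda + 5 = 0 \;\Rightarrow\; (\lambda - 1)(\lambda - 5) = 0$$
Hence the two eigenvalues are 1 and 5.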
Example (continued)
Once we have the eigenvalues, the eigenvectors can be obtained by substituting back into $(A - \lambda I)x = 0$. This gives the eigenvectors $(1, -1)^T$ (for $\lambda = 1$) and $(1, 1/3)^T$ (for $\lambda = 5$). Note that we can scale the eigenvectors any way we want.
Determinants are not used for finding the eigenvalues of large matrices.
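The hand calculation can be cross-checked in MATLAB. This sketch assumes the example matrix is `A = [4 3; 1 2]`, which is consistent with the characteristic polynomial and the eigenvalues 1 and 5 of the example. Forming the characteristic polynomial with `poly` and `roots` is shown only because the example is 2-by-2; `eig` is the routine one would normally call.

```matlab
% Example matrix (assumed reconstruction of the slide's 2-by-2 example).
A = [4 3; 1 2];

% Coefficients of det(lambda*I - A), highest power first.
p = poly(A);

% Eigenvalues as roots of the characteristic polynomial (small matrices only).
lam_roots = sort(roots(p));

% Eigenvalues and eigenvectors from eig, which does not use determinants.
[X, D] = eig(A);
lam_eig = sort(diag(D));

% Residual of A*X = X*D; should be near machine precision.
residual = norm(A*X - X*D);

% Check the hand-computed eigenvectors directly.
r1 = norm(A*[1; -1]  - 1*[1; -1]);
r5 = norm(A*[1; 1/3] - 5*[1; 1/3]);
```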
Positive Definite Matrices
A complex matrix $A$ is positive definite if for every nonzero complex vector $x$ the quadratic form $x^H A x$ is real and
$$x^H A x > 0,$$
where $x^H$ denotes the conjugate transpose of $x$ (i.e., change the sign of the imaginary part of each component of $x$ and then transpose).
Eigenvalues of Positive Definite Matrices
If $A$ is positive definite and $\lambda$ and $x$ are an eigenvalue/eigenvector pair, then
$$Ax = \lambda x \;\Rightarrow\; x^H A x = \lambda\, x^H x.$$
Since $x^H A x$ and $x^H x$ are both real and positive, it follows that $\lambda$ is real and positive.
Properties of Positive Definite Matrices
If $A$ is a positive definite matrix then:
- $A$ is nonsingular.
- The inverse of $A$ is positive definite.
- Gaussian elimination can be performed on $A$ without pivoting.
- The eigenvalues of $A$ are positive.
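These properties can be spot-checked numerically. The sketch below uses an assumed symmetric test matrix `A`, not one from the slides. It relies on the two-output form of `chol`, whose second output is zero exactly when the Cholesky factorization succeeds; note that `chol` only reads one triangle of its input, so it presumes the matrix is symmetric.

```matlab
% Illustrative symmetric matrix (an assumption for this sketch).
A = [4 1; 1 3];

% Cholesky test: flag == 0 means the factorization succeeded,
% which for a symmetric matrix means it is positive definite.
[R, flag] = chol(A);
is_pd = (flag == 0);

% Eigenvalues of A and of its inverse should all be positive.
lam     = eig(A);
lam_inv = eig(inv(A));

% The Cholesky factor reproduces A: A = R' * R.
chol_err = norm(R.' * R - A);
```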
Hermitian Matrices
A square matrix for which $A = A^H$ is said to be a Hermitian matrix. If $A$ is real and Hermitian it is said to be symmetric, and $A = A^T$.
Every positive definite matrix (in the sense defined above) is Hermitian; the converse does not hold. Every eigenvalue of a Hermitian matrix is real. Eigenvectors of a Hermitian matrix belonging to different eigenvalues are orthogonal to each other, i.e., their scalar product is zero.
The Power Method
Label the eigenvalues in order of decreasing absolute value, so $|\lambda_1| > |\lambda_2| > \cdots > |\lambda_n|$. Consider the iteration formula
$$y_{k+1} = A y_k,$$
where we start with some initial $y_0$, so that
$$y_k = A^k y_0.$$
Then $y_k$ converges, up to a scalar multiple, to the eigenvector $x_1$ corresponding to the eigenvalue $\lambda_1$.
Eigen Decomposition
Recall from the previous class the eigen decomposition. Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of the $n \times n$ matrix $A$ and $x_1, x_2, \ldots, x_n$ the corresponding eigenvectors. Let $\Lambda$ be the diagonal matrix with $\lambda_1, \lambda_2, \ldots, \lambda_n$ on the main diagonal. Let $X$ be the $n \times n$ matrix whose $j$th column is $x_j$. Then $AX = X\Lambda$, and so we have the eigen decomposition of $A$:
$$A = X \Lambda X^{-1}.$$
This requires $X$ to be invertible, thus the eigenvectors of $A$ must be linearly independent.
Powers of Matrices
Also recall that if $A = X \Lambda X^{-1}$ then
$$A^2 = (X \Lambda X^{-1})(X \Lambda X^{-1}) = X \Lambda (X^{-1} X) \Lambda X^{-1} = X \Lambda^2 X^{-1}.$$
Hence we have
$$A^p = X \Lambda^p X^{-1}.$$
Thus $A^p$ has the same eigenvectors as $A$, and its eigenvalues are $\lambda_1^p, \lambda_2^p, \ldots, \lambda_n^p$. We can use these results as the basis of an iterative algorithm for finding the eigenvalues of a matrix.
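The identity for matrix powers is easy to confirm numerically. In this sketch the matrix `A` and the exponent `p` are assumed illustrative values; `A` has the distinct real eigenvalues -1 and -2, so its eigenvector matrix is invertible.

```matlab
% Illustrative matrix with distinct eigenvalues -1 and -2 (an assumption).
A = [2 -12; 1 -5];
p = 5;

[X, L] = eig(A);                 % A*X = X*L, with L diagonal

% Power computed directly and through the eigen decomposition.
Ap_direct = A^p;
Lp        = diag(diag(L).^p);    % raise each eigenvalue to the power p
Ap_eig    = X * Lp / X;          % X * Lp * inv(X), without forming inv(X)

rel_err = norm(Ap_direct - Ap_eig) / norm(Ap_direct);
```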
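Proof
We know that $A^k = X \Lambda^k X^{-1}$, so
$$y_k = A^k y_0 = X \Lambda^k X^{-1} y_0.$$
Now we have
$$\Lambda^k = \begin{pmatrix} \lambda_1^k & & & \\ & \lambda_2^k & & \\ & & \ddots & \\ & & & \lambda_n^k \end{pmatrix} = \lambda_1^k \begin{pmatrix} 1 & & & \\ & (\lambda_2/\lambda_1)^k & & \\ & & \ddots & \\ & & & (\lambda_n/\lambda_1)^k \end{pmatrix}.$$
The terms on the diagonal after the first get smaller in absolute value as $k$ increases, since $\lambda_1$ is the dominant eigenvalue.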
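Proof (continued)
So, writing $c = X^{-1} y_0$ with components $c_1, \ldots, c_n$, for large $k$ we have
$$y_k \approx \lambda_1^k \begin{pmatrix} x_1 & x_2 & \cdots & x_n \end{pmatrix} \begin{pmatrix} 1 & & & \\ & 0 & & \\ & & \ddots & \\ & & & 0 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{pmatrix} = \lambda_1^k c_1 x_1.$$
Since $\lambda_1^k c_1 x_1$ is just a constant times $x_1$ (nonzero provided $c_1 \neq 0$), we have the required result.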
Example
Let A = [2 -12; 1 -5] and $y_0 = [1\ \ 1]^T$. Then
$y_1 = -4\,[2.50\ \ 1.00]^T$
$y_2 = 10\,[2.80\ \ 1.00]^T$
$y_3 = -22\,[2.91\ \ 1.00]^T$
$y_4 = 46\,[2.96\ \ 1.00]^T$
$y_5 = -94\,[2.98\ \ 1.00]^T$
$y_6 = 190\,[2.99\ \ 1.00]^T$
The iteration is converging on a scalar multiple of $[3\ \ 1]^T$, which is the correct dominant eigenvector.
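The iterates in this example can be reproduced with a few lines of MATLAB. This is a minimal sketch of the unscaled iteration $y_{k+1} = A y_k$, taking the example's matrix as A = [2 -12; 1 -5] and the start vector as [1; 1]. The variable names are chosen here for illustration.

```matlab
A = [2 -12; 1 -5];   % matrix from the example
y = [1; 1];          % starting vector y_0

for k = 1:6
    y = A*y;                                  % unscaled power iteration step
    % Print y_k as (second component) * [ratio 1], as on the slide.
    fprintf('y%d = %g * [%.2f 1.00]\n', k, y(2), y(1)/y(2));
end

% After six steps y holds y_6; its direction is close to [3; 1].
direction_ratio = y(1) / y(2);
```

The entries of `y` roughly double in magnitude at every step, which is the growth that the scaling step is meant to control.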
Rayleigh Quotient
Note that once we have the eigenvector, the corresponding eigenvalue can be obtained from the Rayleigh quotient
$$\frac{\mathrm{dot}(Ax, x)}{\mathrm{dot}(x, x)},$$
where dot(a, b) is the scalar product of the vectors $a$ and $b$, defined by
$$\mathrm{dot}(a, b) = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n.$$
So for our example, $\lambda_1 = -2$.
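As a sketch, the Rayleigh quotient for the example can be evaluated with MATLAB's built-in `dot`. The exact eigenvector [3; 1] gives the eigenvalue exactly, while the sixth unscaled iterate [568; 190] (obtained by applying A six times to [1; 1]) gives only an approximation. The variable names are illustrative.

```matlab
A = [2 -12; 1 -5];       % matrix from the power method example

% Rayleigh quotient at the exact dominant eigenvector.
x = [3; 1];
lambda_exact = dot(A*x, x) / dot(x, x);

% Rayleigh quotient at an approximate eigenvector (sixth power iterate).
y6 = [568; 190];
lambda_approx = dot(A*y6, y6) / dot(y6, y6);
```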
Scaling
The factor $\lambda_1^k$ can cause problems, as it may become very large (or, when $|\lambda_1| < 1$, very small) as the iteration progresses. To avoid this problem we scale the iteration formula:
$$y_{k+1} = \frac{A y_k}{r_{k+1}},$$
where $r_{k+1}$ is the component of $A y_k$ with largest absolute value.
Example with Scaling
Let A = [2 -12; 1 -5] and $y_0 = [1\ \ 1]^T$.
$A y_0 = [-10\ \ -4]^T$, so $r_1 = -10$ and $y_1 = [1.00\ \ 0.40]^T$.
$A y_1 = [-2.8\ \ -1.0]^T$, so $r_2 = -2.8$ and $y_2 = [1.0\ \ 0.3571]^T$.
$A y_2 = [-2.2857\ \ -0.7857]^T$, so $r_3 = -2.2857$ and $y_3 = [1.0\ \ 0.3437]^T$.
$A y_3 = [-2.1250\ \ -0.7187]^T$, so $r_4 = -2.1250$ and $y_4 = [1.0\ \ 0.3382]^T$.
$A y_4 = [-2.0588\ \ -0.6912]^T$, so $r_5 = -2.0588$ and $y_5 = [1.0\ \ 0.3357]^T$.
$A y_5 = [-2.0286\ \ -0.6786]^T$, so $r_6 = -2.0286$ and $y_6 = [1.0\ \ 0.3345]^T$.
$r$ is converging to the correct eigenvalue $-2$.
Scaling Factor
At step $k+1$, the scaling factor $r_{k+1}$ is the component with largest absolute value in $A y_k$. When $k$ is sufficiently large, $A y_k \approx \lambda_1 y_k$. The component with largest absolute value in $\lambda_1 y_k$ is $\lambda_1$ (since $y_k$ was scaled in the previous step to have largest component 1). Hence $r_{k+1} \to \lambda_1$ as $k \to \infty$.
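MATLAB Code

    function [lambda, y] = powermethod(A, y, n)
    % Scaled power method: n iterations starting from the vector y.
    for i = 1:n
        y = A*y;                 % apply the matrix
        [c, j] = max(abs(y));    % j indexes the component of largest magnitude
        lambda = y(j);           % signed scaling factor, the eigenvalue estimate
        y = y/lambda;            % rescale so the largest component is 1
    end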
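To try the routine on the scaled example, the same loop can be run as a plain script. In this sketch the loop is inlined rather than called as a function so that it is self-contained. The array `r`, which records every scaling factor, is an addition made here for illustration and is not part of the code above.

```matlab
A = [2 -12; 1 -5];   % matrix from the scaled example
y = [1; 1];          % starting vector y_0
n = 6;               % number of iterations
r = zeros(n, 1);     % r(i) stores the scaling factor of step i

for i = 1:n
    y = A*y;                 % apply the matrix
    [~, j] = max(abs(y));    % index of the component of largest magnitude
    r(i) = y(j);             % signed scaling factor
    y = y / r(i);            % rescale so the largest component is 1
end

lambda = r(end);             % current estimate of the dominant eigenvalue
```

After six iterations `lambda` is about -2.0286 and `y` is about [1; 0.3345], matching the last row of the scaled example.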
Convergence
The Power Method relies on us being able to ignore terms of the form $(\lambda_j/\lambda_1)^k$ when $k$ is large enough. Thus, the convergence of the Power Method depends on $|\lambda_2|/|\lambda_1|$. If $|\lambda_2|/|\lambda_1| = 1$ the method will not converge. If $|\lambda_2|/|\lambda_1|$ is close to 1 the method will converge slowly.
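The dependence on $|\lambda_2|/|\lambda_1|$ can be observed directly. The sketch below reuses the matrix A = [2 -12; 1 -5] from the earlier example, whose eigenvalues are -2 and -1, so the ratio is 0.5. It measures the error in the second component of the scaled iterate against the exact eigenvector [1; 1/3]; the number of iterations and the error measure are choices made here for illustration.

```matlab
A = [2 -12; 1 -5];   % eigenvalues -2 and -1, so |lambda_2|/|lambda_1| = 0.5
y = [1; 1];
n = 12;
err = zeros(n, 1);

for i = 1:n
    y = A*y;
    [~, j] = max(abs(y));
    y = y / y(j);                   % scaled power iteration step
    err(i) = abs(y(2) - 1/3);       % distance from the exact eigenvector [1; 1/3]
end

% Ratio of successive errors; it should approach |lambda_2|/|lambda_1|.
ratios = err(2:end) ./ err(1:end-1);
```

The entries of `ratios` approach 0.5, so each iteration roughly halves the error for this matrix. A matrix whose two largest eigenvalues are closer in magnitude would show a ratio nearer to 1 and correspondingly slower convergence.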