Linear Algebra & Numerical Analysis
Eigenvalues and Eigenvectors
Marta Jarošová, http://homel.vsb.cz/~dom033/
Outline
- Methods computing all eigenvalues
  - Characteristic polynomial
  - Jacobi method for symmetric matrices
  - QR iteration: shifts, preliminary reduction
- Methods computing selected eigenvalues
  - Power method: normalization, shifts
  - Rayleigh quotient
- Matlab functions
Methods computing all eigenvalues
- Characteristic polynomial
- Jacobi method for symmetric matrices
- QR iteration: shifts, preliminary reduction
Characteristic polynomial
det(A - λI) = 0
This method is not recommended as a general numerical procedure:
- the coefficients of the characteristic polynomial are not well determined numerically, and the roots can be very sensitive to perturbations in the coefficients
- finding the roots of a polynomial of high degree requires a great deal of work
The two problems are equivalent in theory, but the equivalence is not preserved numerically in practice, and computing the roots of the characteristic polynomial is no simpler than the original eigenvalue problem.
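This sensitivity is easy to observe in Matlab. The following sketch (our illustrative example, not part of the original slides) compares eigenvalues obtained by rooting the characteristic polynomial against those from eig; the choice of the Wilkinson matrix, with its pairs of close eigenvalues, is ours.

    % Sketch: eigenvalues via the characteristic polynomial are ill-conditioned
    A = wilkinson(21);                 % built-in test matrix with close eigenvalue pairs
    p = poly(A);                       % coefficients of the characteristic polynomial
    lam_roots = sort(real(roots(p))); % eigenvalues via polynomial root finding
    lam_eig   = sort(eig(A));          % direct eigenvalue computation
    max(abs(lam_roots - lam_eig))      % typically orders of magnitude larger than eps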
Observation
An important theoretical observation, due to Abel (1824): the roots of a polynomial of degree greater than 4 cannot always be expressed by a closed-form formula in the coefficients using ordinary arithmetic operations and root extraction. Consequently, computing the eigenvalues of a matrix of order greater than 4 requires a (theoretically infinite) iterative process.
Jacobi method for symmetric matrices
Start with A_0 = A and iterate: A_{k+1} = J_k^T A_k J_k, where the plane rotation J_k is chosen to annihilate a symmetric pair of entries in the matrix A_k (so symmetry is preserved).
Recall the plane rotation
[  c  s ]
[ -s  c ]
with c and s the cosine and sine of the angle of rotation.
Jacobi method for symmetric matrices
The choice of c and s is slightly more complicated than in the Givens QR method, because we are annihilating a symmetric pair of matrix entries by a similarity transformation (in Givens QR, a single entry is annihilated by a one-sided transformation). Consider the 2x2 symmetric matrix
[ a  b ]
[ b  d ]
with b ≠ 0 (otherwise it is already diagonal). The transformed matrix J^T A J is diagonal if c and s satisfy b(c^2 - s^2) + cs(a - d) = 0, i.e. tan 2θ = 2b / (d - a).
Jacobi method for symmetric matrices
Plane rotations are repeatedly applied from both sides of the matrix until the off-diagonal entries are reduced to zero (within some tolerance). The resulting approximately diagonal matrix is orthogonally similar to the original matrix, so we obtain approximations of the eigenvalues on the diagonal, and the product of all the plane rotations gives the eigenvectors. A sketch of the resulting algorithm follows.
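A minimal Matlab sketch of the cyclic Jacobi method (our own code, not from the original slides; the function name jacobi_eig and the tolerance handling are our choices):

    function [d, V] = jacobi_eig(A, tol)
    % Cyclic Jacobi method for a symmetric matrix A.
    n = size(A, 1);
    V = eye(n);
    while norm(A - diag(diag(A)), 'fro') > tol
        for p = 1:n-1
            for q = p+1:n
                if A(p,q) ~= 0
                    % choose c, s to annihilate the (p,q)/(q,p) pair:
                    % tan(2*theta) = 2*A(p,q) / (A(q,q) - A(p,p))
                    theta = 0.5 * atan2(2*A(p,q), A(q,q) - A(p,p));
                    c = cos(theta); s = sin(theta);
                    J = eye(n);
                    J(p,p) = c; J(q,q) = c; J(p,q) = s; J(q,p) = -s;
                    A = J' * A * J;   % similarity transformation: eigenvalues preserved
                    V = V * J;        % accumulate rotations -> eigenvector matrix
                end
            end
        end
    end
    d = diag(A);                      % approximate eigenvalues
    end

Called as [d, V] = jacobi_eig(A, 1e-12) for a symmetric A, the vector d approximates eig(A) and A*V ≈ V*diag(d).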
Jacobi method for symmetric matrices
PROS:
- easy to implement (also in parallel)
- high accuracy
CONS:
- slow convergence
- only for symmetric matrices
Main source of inefficiency: annihilated entries can become nonzero again.
Example: Jacobi method for symmetric matrices
First annihilate the entries (1,3) and (3,1) using a plane rotation.
Next annihilate the entries (1,2) and (2,1) using a plane rotation.
Jacobi method for symmetric matrices (example)
Next annihilate the entries (2,3) and (3,2) using a plane rotation.
Beginning a new sweep, we again annihilate the entries (1,3) and (3,1) using a plane rotation.
QR iteration
QR iteration makes repeated use of the QR factorization to produce an orthogonal similarity transformation of the matrix to diagonal or triangular form:
- initialization: A_0 = A
- QR factorization in the k-th step: A_{k-1} = Q_k R_k
- reverse product: A_k = R_k Q_k
Since A_k = R_k Q_k = Q_k^T A_{k-1} Q_k, the matrices A_k are orthogonally similar to each other.
QR iteration
If the eigenvalues are all distinct, the A_k converge to triangular form for a general initial matrix, or to diagonal form for a symmetric initial matrix.
QR iteration
After one iteration the off-diagonal entries are smaller, so the matrix is closer to being triangular, and the diagonal entries are now closer to the eigenvalues, which are 8 and 3. Repeating this process until convergence, the diagonal entries would closely approximate the eigenvalues, and the product of the orthogonal matrices Q_k would yield the corresponding eigenvectors. A runnable sketch follows.
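A minimal Matlab sketch of the unshifted iteration (our code; the matrix [7 2; 2 4] is our choice of a symmetric matrix with eigenvalues 8 and 3, matching the values quoted above):

    A = [7 2; 2 4];            % symmetric, eigenvalues 8 and 3
    Qacc = eye(2);
    for k = 1:50
        [Q, R] = qr(A);        % factor: A_{k-1} = Q_k R_k
        A = R * Q;             % reverse product: A_k = R_k Q_k = Q_k' A_{k-1} Q_k
        Qacc = Qacc * Q;       % accumulated Q_k approximate the eigenvectors
    end
    diag(A)                    % approximately [8; 3]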
QR iteration with shifts
The convergence rate of QR iteration can be accelerated by introducing shifts:
A_{k-1} - σ_k I = Q_k R_k
A_k = R_k Q_k + σ_k I
where σ_k approximates an eigenvalue.
Choice of shift: the lower right corner entry of the matrix. A better shift: compute the eigenvalues of the 2x2 submatrix in the lower right corner of the matrix.
QR iteration with shifts Example: Compared with the unshifted algorithm, the off-diagonal entries are smaller after one iteration, and the diagonal entries are closer approximations to the eigenvalues. For the next iteration, we would use the new value of the lower right corner entry as the shift.
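A sketch of the shifted iteration with the corner-entry shift (again our illustrative code, reusing the same example matrix):

    A = [7 2; 2 4];
    for k = 1:10
        sigma = A(end, end);              % shift: lower right corner entry
        [Q, R] = qr(A - sigma * eye(2));
        A = R * Q + sigma * eye(2);       % add the shift back
    end
    diag(A)                               % converges in fewer iterations than unshifted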
Preliminary reduction
Each iteration of the QR method requires O(n^3) operations. This cost can be reduced if the matrix is first transformed into a simpler form: it is advantageous if the matrix is as close as possible to triangular form (or, for a symmetric matrix, diagonal form) before the QR iterations begin.
Preliminary reduction
A Hessenberg matrix is triangular except for one additional nonzero diagonal immediately adjacent to the main diagonal.
Note: a symmetric Hessenberg matrix is tridiagonal.
Preliminary reduction
Any matrix can be reduced to Hessenberg form in a finite number of steps by orthogonal similarity transformations (e.g. using Householder transformations). The upper Hessenberg or tridiagonal form is then preserved during the subsequent QR iterations.
Advantages:
- the work per QR iteration is reduced to at most O(n^2)
- the convergence rate of the QR iterations is enhanced
- if there are any zero entries on the first subdiagonal, the problem can be broken into two or more smaller subproblems
A short demonstration follows.
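In Matlab the reduction is available as the built-in hess function; a quick check on an arbitrary matrix (our example):

    A = randn(5);          % any square matrix
    [P, H] = hess(A);      % H = P'*A*P is upper Hessenberg, P orthogonal
    norm(A - P*H*P')       % close to machine precision
    norm(tril(H, -2))      % zero: no entries below the first subdiagonal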
QR iteration with preliminary reduction
Two-stage process of the QR method:
- symmetric matrix -> tridiagonal -> diagonal
- general matrix -> Hessenberg -> triangular
The preliminary reduction requires a finite number of steps, whereas the subsequent iterative stage continues until convergence (in practice, only a modest number of iterations is usually required), so the O(n^3) cost of the preliminary reduction is a significant factor in the total cost.
QR iteration with preliminary reduction
The total cost depends on whether the eigenvectors are needed, because their inclusion determines whether the orthogonal transformations must be accumulated:
- symmetric case: the overall cost is roughly (4/3)n^3 operations if only the eigenvalues are needed, and about 9n^3 operations if the eigenvectors are also desired
- general case: about 9n^3 operations (eigenvalues only) and about 25n^3 operations (eigenvectors as well)
Methods computing selected eigenvalues
- Power method: normalization, shifts
- Rayleigh quotient
Power method
The simplest method for computing a single eigenvalue and eigenvector: take successively higher powers of the matrix times an initial starting vector. Assume that the matrix has a unique eigenvalue λ_1 of maximum modulus (the dominant eigenvalue) with corresponding eigenvector u_1. Then, for a given starting vector x_0, the iterates x_k = A x_{k-1} converge to a multiple of u_1.
Power method with normalization
In practice the iterates must be rescaled at each step (e.g. x_k = y_k / ||y_k||_∞ with y_k = A x_{k-1}), since otherwise their entries grow or decay geometrically and may overflow or underflow.
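A minimal Matlab sketch of the normalized power method (our code and our choice of test matrix; the dominant eigenvalue is 4, with eigenvector [1; 1]):

    A = [3 1; 1 3];                % eigenvalues 4 and 2
    x = [0; 1];                    % starting vector
    for k = 1:25
        y = A * x;
        [~, i] = max(abs(y));
        lambda = y(i);             % approximates the dominant eigenvalue 4
        x = y / lambda;            % normalize so the largest entry is 1
    end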
Power method with shifts
Convergence rate: depends on the ratio |λ_2 / λ_1|, where λ_2 is the eigenvalue of second largest modulus; the smaller this ratio, the faster the convergence. It may be possible to choose a shift σ such that
|λ_2 - σ| / |λ_1 - σ| < |λ_2 / λ_1|
and thus accelerate convergence by iterating with A - σI instead of A; the shift must then be added back to the result to obtain the eigenvalue of the original matrix.
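A sketch of one shifted variant on the same test matrix (our code; the shift σ = 1 is an arbitrary illustrative choice, improving the ratio from 2/4 to 1/3):

    A = [3 1; 1 3]; sigma = 1;     % eigenvalues 4 and 2; shifted: 3 and 1
    x = [0; 1];
    for k = 1:25
        y = (A - sigma * eye(2)) * x;
        [~, i] = max(abs(y));
        mu = y(i);                 % dominant eigenvalue estimate for A - sigma*I
        x = y / mu;
    end
    lambda = mu + sigma            % add the shift back: approximates 4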
Rayleigh quotient
Given an approximate eigenvector x for a real matrix A, the best estimate of the corresponding eigenvalue λ can be obtained by treating x λ ≅ A x as an n x 1 linear least squares problem. The normal equation is x^T x λ = x^T A x, so the least squares solution is given by the Rayleigh quotient
λ = (x^T A x) / (x^T x).
Power method and Rayleigh quotient
The Rayleigh quotient can be used within the power method to estimate the eigenvalue at each iteration.
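A minimal sketch of this combination (ours, reusing the test matrix from above):

    A = [3 1; 1 3];
    x = [0; 1];
    for k = 1:25
        x = A * x;
        x = x / norm(x);           % normalize in the 2-norm
        lambda = x' * A * x;       % Rayleigh quotient (here x'*x = 1)
    end
    lambda                         % approximates the dominant eigenvalue 4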
Matlab functions
E = eig(X) is a vector containing the eigenvalues of a square matrix X.
[V,D] = eig(X) produces a diagonal matrix D of eigenvalues and a full matrix V whose columns are the corresponding eigenvectors, so that X*V = V*D.
D = eigs(A) returns a vector of the 6 largest magnitude eigenvalues of A. A must be square, and should be large and sparse.
[V,D] = eigs(A) returns a diagonal matrix D of the 6 largest magnitude eigenvalues of A and a matrix V whose columns are the corresponding eigenvectors.
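For instance, on the 2x2 example used earlier:

    A = [7 2; 2 4];
    [V, D] = eig(A)        % diag(D) contains 3 and 8; columns of V are eigenvectors
    norm(A*V - V*D)        % close to machine precision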
References
Michael T. Heath, Scientific Computing: An Introductory Survey, McGraw-Hill, 2002.
Lloyd N. Trefethen and David Bau, Numerical Linear Algebra, SIAM, 1997 (available on Google Books).