
Computing Eigenvalues and/or Eigenvectors; Part 2, The Power method and QR-algorithm
Tom Lyche
Centre of Mathematics for Applications, Department of Informatics, University of Oslo
November 19, 2010

Today
- The power method to find the dominant eigenvector
- The shifted power method to speed up convergence
- The inverse power method
- The Rayleigh quotient iteration
- The QR-algorithm

The Power Method
Find the eigenvector corresponding to the dominant (largest in absolute value) eigenvalue. With a simple modification we can also find the corresponding eigenvalue.

Assumptions
Let $A \in \mathbb{C}^{n,n}$ have eigenpairs $(\lambda_j, v_j)$, $j = 1, \ldots, n$. Given $z_0 \in \mathbb{C}^n$ we assume that
(i) $|\lambda_1| > |\lambda_2| \geq |\lambda_3| \geq \cdots \geq |\lambda_n|$,
(ii) $z_0^T v_1 \neq 0$,
(iii) $A$ has linearly independent eigenvectors.
The first assumption means that $A$ has a dominant eigenvalue $\lambda_1$ of algebraic multiplicity one. The second assumption says that $z_0$ has a component in the direction $v_1$. The third assumption is not necessary; it is included to simplify the analysis.

Powers
Given $A \in \mathbb{C}^{n,n}$ and a vector $z_0 \in \mathbb{C}^n$, assume that (i), (ii), (iii) hold. Define a sequence $\{z_k\}$ of vectors in $\mathbb{C}^n$ by
$$z_k := A^k z_0 = A z_{k-1}, \qquad k = 1, 2, \ldots$$
By (iii) we can write $z_0 = c_1 v_1 + c_2 v_2 + \cdots + c_n v_n$, with $c_1 \neq 0$ by (ii). Since $A^k v_j = \lambda_j^k v_j$ for $k = 0, 1, 2, \ldots$ and $j = 1, \ldots, n$, we get
$$z_k = c_1 A^k v_1 + \cdots + c_n A^k v_n = c_1 \lambda_1^k v_1 + \cdots + c_n \lambda_n^k v_n = \lambda_1^k \Big( c_1 v_1 + c_2 \big(\tfrac{\lambda_2}{\lambda_1}\big)^k v_2 + \cdots + c_n \big(\tfrac{\lambda_n}{\lambda_1}\big)^k v_n \Big).$$
Since $|\lambda_j / \lambda_1| < 1$ for $j \geq 2$, it follows that $z_k / \lambda_1^k \to c_1 v_1$ as $k \to \infty$.

The Power Method
We need to normalize the vectors $z_k$, but we do not know $\lambda_1$. Choose a norm on $\mathbb{C}^n$, set $x_0 = z_0 / \|z_0\|$, and generate unit vectors $\{x_k\}$ for $k = 1, 2, \ldots$ by
(i) $y_k = A x_{k-1}$,
(ii) $x_k = y_k / \|y_k\|$.  (1)

Example 1
$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $z_0 = [1.0, 1.0]^T$, $x_0 = [0.707, 0.707]^T$,
$x_1 = [0.39, 0.92]^T$, $x_2 = [0.4175, 0.9087]^T$, $x_3 = [0.4159, 0.9094]^T, \ldots$
The sequence converges to an eigenvector of $A$.

Example 2
The way $\{x_k\}$ converges to an eigenvector can be more complicated.
$A = \begin{bmatrix} -1 & -2 \\ -3 & -4 \end{bmatrix}$, $z_0 = [1.0, 1.0]^T$, $x_0 = [0.707, 0.707]^T$,
$x_1 = [-0.39, -0.92]^T$, $x_2 = [0.4175, 0.9087]^T$, $x_3 = [-0.4159, -0.9094]^T, \ldots$
The sign changes in each iteration.

Convergence
Lemma. Suppose (i), (ii), (iii) hold. Then
$$\lim_{k \to \infty} \Big( \frac{|\lambda_1|}{\lambda_1} \Big)^k x_k = \frac{c_1}{|c_1|} \frac{v_1}{\|v_1\|}.$$
In particular, if $\lambda_1 > 0$ and $c_1 > 0$, then the sequence $\{x_k\}$ converges to the eigenvector $u_1 := v_1 / \|v_1\|$ of unit length. In Example 2, $\lambda_1 < 0$ and $c_1 > 0$ if $u_1 = [0.4159, 0.9094]^T$. For $k$ large:
$$x_k \approx \Big( \frac{\lambda_1}{|\lambda_1|} \Big)^k \frac{c_1}{|c_1|} u_1 = (-1)^k u_1.$$

Eigenvalue
Suppose we know an approximate eigenvector of $A \in \mathbb{C}^{n,n}$. How should we estimate the corresponding eigenvalue? If $(\lambda, u)$ is an exact eigenpair, then $Au - \lambda u = 0$. If $u$ is an approximate eigenvector, we can minimize the function $\rho : \mathbb{C} \to \mathbb{R}$ given by
$$\rho(\mu) := \|Au - \mu u\|_2.$$
Theorem. $\rho$ is minimized when $\mu = \nu := \frac{u^* A u}{u^* u}$, the Rayleigh quotient of $u$. Proof on blackboard.
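Since the proof is deferred to the blackboard, here is one short way to see it (a standard orthogonality argument added here, not taken from the slides). Write $\mu = \nu + \delta$ with $\nu = u^* A u / u^* u$. Then
$$u^*(Au - \nu u) = u^* A u - \nu\, u^* u = 0,$$
so $Au - \mu u = (Au - \nu u) - \delta u$ is a sum of two orthogonal vectors, and by Pythagoras
$$\rho(\mu)^2 = \|Au - \nu u\|_2^2 + |\delta|^2 \|u\|_2^2,$$
which is minimized exactly when $\delta = 0$, i.e. when $\mu = \nu$.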

Power with Rayleigh quotient

function [l,x,it] = powerit(A,z,K,tol)
% Power method with Rayleigh quotient eigenvalue estimate.
% l: eigenvalue estimate, x: unit eigenvector estimate,
% it: iterations used (it = K+1 signals no convergence in K steps).
af = norm(A,'fro');            % scale for the relative residual test
x = z/norm(z);
for k = 1:K
    y = A*x;
    l = x'*y;                  % Rayleigh quotient of x (x has unit norm)
    if norm(y - l*x)/af < tol  % residual ||Ax - l*x|| small relative to ||A||
        it = k;
        x = y/norm(y);
        return
    end
    x = y/norm(y);
end
it = K+1;
end

Example
$$A_1 := \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad A_2 := \begin{bmatrix} 1.7 & -0.4 \\ 0.15 & 2.2 \end{bmatrix}, \quad A_3 := \begin{bmatrix} 1 & 2 \\ -3 & 4 \end{bmatrix}.$$
Start with a random vector and tol = $10^{-6}$. We get convergence in 7 iterations for $A_1$, in 174 iterations for $A_2$, and no convergence for $A_3$. $A_3$ has two complex (conjugate) eigenvalues, so assumption (i) is not satisfied. The rate of convergence depends on $r = |\lambda_2| / |\lambda_1|$: the smaller $r$, the faster the convergence. We have $r \approx 0.07$ for $A_1$ and $r = 0.95$ for $A_2$.
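A minimal way to reproduce this experiment with powerit from the previous slide (the seed and the iteration limit K = 1000 are my choices, added only for reproducibility):

A1 = [1 2; 3 4]; A2 = [1.7 -0.4; 0.15 2.2]; A3 = [1 2; -3 4];
rng(1);                                    % any seed; fixed for reproducibility
z = rand(2,1);                             % random starting vector
[l1,x1,it1] = powerit(A1, z, 1000, 1e-6);  % fast: r is approximately 0.07
[l2,x2,it2] = powerit(A2, z, 1000, 1e-6);  % slow: r = 0.95
[l3,x3,it3] = powerit(A3, z, 1000, 1e-6);  % it3 = 1001 signals no convergence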

The shifted power method
A variant of the power method is the shifted power method. In this method we choose a number $s$ and apply the power method to the matrix $A - sI$. The number $s$ is called a shift since it shifts an eigenvalue $\lambda$ of $A$ to $\lambda - s$ of $A - sI$. Sometimes the convergence can be faster if the shift is chosen intelligently. For example, for $A_2$ with shift $s = 1.8$ we get convergence in 17 iterations, instead of 174 without a shift.
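Using powerit and the starting vector z from the snippet above, the shifted method is just a change of matrix; the returned Rayleigh quotient then estimates an eigenvalue of $A_2 - sI$, so the shift must be added back (this calling pattern is my illustration, not from the slides):

s = 1.8;
[ls,xs,its] = powerit(A2 - s*eye(2), z, 1000, 1e-6);
lambda = ls + s;    % eigenvalue estimate for A2 itself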

The inverse power method
We apply the power method to the inverse matrix $(A - sI)^{-1}$, where $s$ is a shift. If $A$ has eigenvalues $\lambda_1, \ldots, \lambda_n$ in no particular order, then $(A - sI)^{-1}$ has eigenvalues
$$\mu_1(s) = (\lambda_1 - s)^{-1}, \ \mu_2(s) = (\lambda_2 - s)^{-1}, \ \ldots, \ \mu_n(s) = (\lambda_n - s)^{-1}.$$
Suppose $\lambda_1$ is a simple eigenvalue of $A$. Then $\lim_{s \to \lambda_1} |\mu_1(s)| = \infty$, while $\lim_{s \to \lambda_1} \mu_j(s) = (\lambda_j - \lambda_1)^{-1}$ is finite for $j = 2, \ldots, n$. Hence, by choosing $s$ sufficiently close to $\lambda_1$, the inverse power method will converge to an eigenvector corresponding to that eigenvalue.

For the inverse power method, (1) is replaced by
(i) $(A - sI) y_k = x_{k-1}$,
(ii) $x_k = y_k / \|y_k\|$.  (2)
Note that we solve a linear system rather than computing the inverse matrix. Normally the PLU factorization of $A - sI$ is precomputed in order to speed up the iteration.
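A minimal sketch of (2) with the PLU factorization computed once, outside the loop; the function name invpowerit and the relative residual test are my choices, not from the slides:

function [l,x,it] = invpowerit(A,s,z,K,tol)
% Inverse power method with fixed shift s.  Solves (A - s*I)y = x using a
% precomputed PLU factorization instead of forming an inverse.
n = size(A,1);
[L,U,P] = lu(A - s*eye(n));    % factor once: O(n^3); each solve is O(n^2)
x = z/norm(z);
for k = 1:K
    y = U\(L\(P*x));           % two triangular solves per iteration
    x = y/norm(y);
    Ax = A*x;
    l = x'*Ax;                 % Rayleigh quotient: eigenvalue estimate for A
    if norm(Ax - l*x) < tol*norm(A,'fro')
        it = k; return
    end
end
it = K+1;
end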

Rayleigh quotient iteration
We can combine the inverse power method with the Rayleigh quotient computation:
(i) $(A - s_{k-1} I) y_k = x_{k-1}$,
(ii) $x_k = y_k / \|y_k\|_2$,
(iii) $s_k = x_k^* A x_k$,
(iv) $r_k = A x_k - s_k x_k$.
We can avoid the explicit calculation of $A x_k$ in (iii) and (iv).
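A sketch of the iteration exactly as written above; it recomputes $A x_k$ in (iii) and (iv) rather than using the saving mentioned in the last sentence, and the names are my own:

function [s,x,it] = rqiter(A,z,s,K,tol)
% Rayleigh quotient iteration.  The shift changes every step, so the
% matrix A - s*I must be refactored in each iteration.
n = size(A,1);
x = z/norm(z);
for k = 1:K
    y = (A - s*eye(n))\x;    % step (i): new factorization each time
    x = y/norm(y);           % step (ii)
    Ax = A*x;
    s = x'*Ax;               % step (iii): Rayleigh quotient
    r = Ax - s*x;            % step (iv): residual
    if norm(r) < tol
        it = k; return
    end
end
it = K+1;
end

Called as rqiter(A1, [1;1], 0, 10, 1e-15), this should reproduce the experiment in the next example.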

Example
$A_1 := \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$. We try to find the smallest eigenvalue $\lambda = (5 - \sqrt{33})/2 \approx -0.37$. Start with $x = [1, 1]^T$ and $s = 0$.

k                1          2          3          4          5
$\|r_k\|_2$      1.0e+000   7.7e-002   1.6e-004   8.2e-010   2.0e-020
$s_k - \lambda$  3.7e-001   -1.2e-002  -2.9e-005  -1.4e-010  -2.2e-016

Table: Quadratic convergence of the Rayleigh quotient iteration.

Problem with singularity?
The linear system in (i) becomes closer and closer to singular as $s_k$ converges to the eigenvalue. Thus the system becomes more and more ill-conditioned, and we can expect large errors in the computed $y_k$. This is indeed true, but we are lucky: most of the error occurs in the direction of the eigenvector, and this error disappears when we normalize $y_k$ in (ii). Miraculously, the normalized eigenvector will be quite accurate.

Discussion
Since the shift changes from iteration to iteration, computing $y_k$ requires $O(n^3)$ flops per iteration for a full matrix. For such a matrix it may pay to reduce it to upper Hessenberg or tridiagonal form before starting the iteration. However, if we have a good approximation to an eigenpair, then only a few iterations are necessary to obtain close to machine accuracy. If Rayleigh quotient iteration converges, the convergence is quadratic and sometimes even cubic.

The QR Algorithm
An iterative method to compute all eigenvalues and eigenvectors of a matrix $A \in \mathbb{C}^{n,n}$. The matrix is reduced to triangular or quasi-triangular form by a sequence of unitary similarity transformations computed from the QR factorization of $A$. Recall that for a square matrix the QR factorization and the QR decomposition are the same: if $A = QR$ is a QR factorization, then $Q \in \mathbb{C}^{n,n}$ is unitary, $Q^* Q = I$, and $R \in \mathbb{C}^{n,n}$ is upper triangular.

Basic QR
$A_1 = A$
for $k = 1, 2, \ldots$
    $Q_k R_k = A_k$   (QR factorization of $A_k$)
    $A_{k+1} = R_k Q_k$
end   (3)
The determination of the QR factorization of $A_k$ and the computation of $R_k Q_k$ is called a QR step.
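A direct transcription of (3) into MATLAB, intended for experimentation rather than production use (the function name and the fixed iteration count are my choices):

function A = basicqr(A,K)
% Perform K unshifted QR steps.  For matrices with real eigenvalues the
% iterates tend to an upper triangular matrix with the eigenvalues on
% the diagonal.
for k = 1:K
    [Q,R] = qr(A);   % QR factorization of A_k
    A = R*Q;         % A_{k+1} = R_k*Q_k, unitarily similar to A_k
end
end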

Example
$$A_1 = A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix} = \Big( \tfrac{1}{\sqrt{5}} \begin{bmatrix} 2 & 1 \\ -1 & 2 \end{bmatrix} \Big) \Big( \tfrac{1}{\sqrt{5}} \begin{bmatrix} 5 & -4 \\ 0 & 3 \end{bmatrix} \Big) = Q_1 R_1,$$
$$A_2 = R_1 Q_1 = \tfrac{1}{5} \begin{bmatrix} 5 & -4 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ -1 & 2 \end{bmatrix} = \tfrac{1}{5} \begin{bmatrix} 14 & -3 \\ -3 & 6 \end{bmatrix} = \begin{bmatrix} 2.8 & -0.6 \\ -0.6 & 1.2 \end{bmatrix},$$
$$A_4 \approx \begin{bmatrix} 2.997 & -0.074 \\ -0.074 & 1.0027 \end{bmatrix}, \qquad A_{10} \approx \begin{bmatrix} 3.0000 & -0.0001 \\ -0.0001 & 1.0000 \end{bmatrix}.$$
$A_{10}$ is almost diagonal and contains approximations to the eigenvalues $\lambda_1 = 3$ and $\lambda_2 = 1$ on the diagonal.

Example 2
$$A_1 = A = \begin{bmatrix} 0.9501 & 0.8913 & 0.8214 & 0.9218 \\ 0.2311 & 0.7621 & 0.4447 & 0.7382 \\ 0.6068 & 0.4565 & 0.6154 & 0.1763 \\ 0.4860 & 0.0185 & 0.7919 & 0.4057 \end{bmatrix}.$$
After 13 QR steps we obtain
$$A_{14} \approx \begin{bmatrix} 2.323 & 0.047223 & 0.39232 & 0.65056 \\ 2.1\cdot 10^{-10} & 0.13029 & -0.36125 & 0.15946 \\ 4.1\cdot 10^{-10} & 0.58622 & 0.052576 & 0.25774 \\ 1.2\cdot 10^{-14} & 3.3\cdot 10^{-5} & 1.1\cdot 10^{-5} & 0.22746 \end{bmatrix},$$
which is close to quasi-triangular.

Example 2 (continued)
The two $1 \times 1$ diagonal blocks of $A_{14}$ give us two real eigenvalues, $\lambda_1 \approx 2.323$ and $\lambda_4 \approx 0.2275$. The middle $2 \times 2$ block has complex eigenvalues, giving $\lambda_2 \approx 0.0914 + 0.4586i$ and $\lambda_3 \approx 0.0914 - 0.4586i$. From Gerschgorin's circle theorem it follows that the approximations to the real eigenvalues are quite accurate. We would also expect the complex eigenvalues to have small absolute errors.

Why QR works
Since $Q_k^* A_k = R_k$, we obtain
$$A_{k+1} = R_k Q_k = Q_k^* A_k Q_k. \qquad (4)$$
Thus $A_{k+1}$ is similar to $A_k$, and hence to $A$. The algorithm combines the power method and the Rayleigh quotient iteration. If $A \in \mathbb{R}^{n,n}$ has real eigenvalues, then under fairly general conditions the sequence $\{A_k\}$ converges to an upper triangular matrix, the Schur form. If $A$ is real but has some complex eigenvalues, then the convergence is to the quasi-triangular Schur form.

QR factorization of $A^k$
Theorem. For $k = 1, 2, 3, \ldots$, the QR factorization of $A^k$ is $A^k = \underline{Q}_k \underline{R}_k$, where
$$\underline{Q}_k := Q_1 \cdots Q_k \quad \text{and} \quad \underline{R}_k := R_k \cdots R_1, \qquad (5)$$
and $Q_1, \ldots, Q_k$, $R_1, \ldots, R_k$ are the matrices generated by the basic QR algorithm (3). Moreover,
$$A_k = \underline{Q}_{k-1}^* A\, \underline{Q}_{k-1}, \qquad k \geq 1. \qquad (6)$$
Proof on blackboard.

Relation to the Power Method
Since $\underline{R}_k$ is upper triangular, its first column is a multiple of $e_1$:
$$A^k e_1 = \underline{Q}_k \underline{R}_k e_1 = r_{11}^{(k)} \underline{Q}_k e_1, \quad \text{or} \quad q_1^{(k)} := \underline{Q}_k e_1 = \frac{1}{r_{11}^{(k)}} A^k e_1.$$
Since $\|q_1^{(k)}\|_2 = 1$, the first column of $\underline{Q}_k$ is the result of applying the normalized power iteration to the starting vector $x_0 = e_1$. If this iteration converges, we conclude that the first column of $\underline{Q}_k$ must converge to a dominant eigenvector of $A$.

Initial reduction to Hessenberg form
One QR step requires $O(n^3)$ flops for a full matrix $A$ of order $n$. By an initial reduction of $A$ to upper Hessenberg form $H_1$, the cost of a QR step can be reduced to $O(n^2)$.
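In MATLAB the reduction is available as the built-in hess; a quick sanity check that a QR step preserves the Hessenberg pattern (an illustration of the next slides, not taken from them):

A = rand(6);
H = hess(A);              % unitary similarity reduction to upper Hessenberg
[Q,R] = qr(H);
H2 = R*Q;                 % one QR step
disp(norm(tril(H2,-2)))   % entries below the first subdiagonal stay ~ 0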

Invariance of the Hessenberg form; I
Consider a QR step on $H_1$. We determine plane rotations $P_{i,i+1}$, $i = 1, \ldots, n-1$, so that $P_{n-1,n} \cdots P_{1,2} H_1 = R_1$ is upper triangular. For $n = 4$:

[x x x x]            [x x x x]            [x x x x]            [x x x x]
[x x x x]  P_{1,2}   [0 x x x]  P_{2,3}   [0 x x x]  P_{3,4}   [0 x x x]
[0 x x x]  ------>   [0 x x x]  ------>   [0 0 x x]  ------>   [0 0 x x]
[0 0 x x]            [0 0 x x]            [0 0 x x]            [0 0 0 x]

Invariance of the Hessenberg form; II
Thus $H_1 = Q_1 R_1$, where $Q_1 = P_{1,2}^* \cdots P_{n-1,n}^*$, is a QR factorization of $H_1$. To finish the QR step we compute $R_1 Q_1 = R_1 P_{1,2}^* \cdots P_{n-1,n}^*$. This postmultiplication step is illustrated by the Wilkinson diagram

      [x x x x]            [x x x x]            [x x x x]            [x x x x]
R_1 = [0 x x x]  P_{1,2}   [x x x x]  P_{2,3}   [x x x x]  P_{3,4}   [x x x x]
      [0 0 x x]  ------>   [0 0 x x]  ------>   [0 x x x]  ------>   [0 x x x]
      [0 0 0 x]            [0 0 0 x]            [0 0 0 x]            [0 0 x x]

Each rotation applied on the right combines two neighbouring columns, so only one new nonzero appears below the diagonal per step, and the result is upper Hessenberg.
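A sketch of one QR step on an upper Hessenberg matrix using Givens rotations, making the $O(n^2)$ cost visible; this is a hand-rolled illustration built on MATLAB's planerot, not the slides' code:

function H = hessqrstep(H)
% One QR step A_{k+1} = R*Q on an upper Hessenberg matrix in O(n^2) flops.
n = size(H,1);
G = zeros(2,2,n-1);
for i = 1:n-1   % premultiply: P_{n-1,n}*...*P_{1,2}*H = R
    [G(:,:,i),H(i:i+1,i)] = planerot(H(i:i+1,i));   % zero out H(i+1,i)
    H(i:i+1,i+1:n) = G(:,:,i)*H(i:i+1,i+1:n);
end
for i = 1:n-1   % postmultiply: R*P_{1,2}'*...*P_{n-1,n}'
    H(1:i+1,i:i+1) = H(1:i+1,i:i+1)*G(:,:,i)';      % fills in H(i+1,i) only
end
end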

Invariance of the Hessenberg form; III
If $A_k$ is upper Hessenberg, then $A_{k+1}$ is upper Hessenberg. One QR step requires $O(n^2)$ flops. If $A$ is tridiagonal and symmetric, then one QR step requires only $O(n)$ flops.

Deflation
If a subdiagonal element $a_{i+1,i}$ of an upper Hessenberg matrix $A$ is equal to zero, then the eigenvalues of $A$ are the union of the eigenvalues of the two smaller matrices $A(1{:}i, 1{:}i)$ and $A(i{+}1{:}n, i{+}1{:}n)$. Thus, if during the iteration the $(i+1, i)$ element of $A_k$ is sufficiently small, we can continue the iteration on the two smaller submatrices separately.
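A sketch of the deflation test, splitting on the first negligible subdiagonal entry; the tolerance and the splitting strategy are illustrative choices:

sub = abs(diag(H,-1));                   % subdiagonal of the Hessenberg H
i = find(sub < 1e-12*norm(H,'fro'), 1);  % first negligible (i+1,i) entry
if ~isempty(i)
    H(i+1,i) = 0;                        % perturbs A by at most the tolerance
    H11 = H(1:i,1:i);                    % continue on the two blocks
    H22 = H(i+1:end,i+1:end);            % separately
end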

Effect of deflation on $A$
Suppose $|a_{i+1,i}^{(k)}| \leq \epsilon$ and set
$$\hat{A}_k := A_k - a_{i+1,i}^{(k)} e_{i+1} e_i^T.$$
Since $A_k = \underline{Q}_{k-1}^* A\, \underline{Q}_{k-1}$, we get
$$\hat{A}_k = \underline{Q}_{k-1}^* (A + E)\, \underline{Q}_{k-1}, \qquad E = -\underline{Q}_{k-1} \big( a_{i+1,i}^{(k)} e_{i+1} e_i^T \big) \underline{Q}_{k-1}^*,$$
and
$$\|E\|_F = \big\| a_{i+1,i}^{(k)} e_{i+1} e_i^T \big\|_F = |a_{i+1,i}^{(k)}| \leq \epsilon.$$
Thus setting $a_{i+1,i}^{(k)} = 0$ amounts to a perturbation of the original $A$ of size at most $\epsilon$.

The shifted QR algorithms
As in the inverse power method, it is possible to speed up the convergence by introducing shifts. The explicitly shifted QR algorithm works as follows:
$A_1 = A$
for $k = 1, 2, \ldots$
    Choose a shift $s_k$
    $Q_k R_k = A_k - s_k I$   (QR factorization of $A_k - s_k I$)
    $A_{k+1} = R_k Q_k + s_k I$
end   (7)
Since $R_k = Q_k^* (A_k - s_k I)$, we find
$$A_{k+1} = Q_k^* (A_k - s_k I) Q_k + s_k I = Q_k^* A_k Q_k,$$
so $A_{k+1}$ and $A_k$ are unitarily similar.
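A sketch of (7) using the Rayleigh quotient shift from the next slide, deflating one eigenvalue at a time from the lower right corner; it is reliable mainly when the Rayleigh shift converges (e.g. real eigenvalues), and all names and the stopping rule are my choices:

function d = shiftedqr(A,K,tol)
% Explicitly shifted QR with shift s_k = H(m,m), deflating from the corner.
n = size(A,1); d = zeros(n,1);
H = hess(A);                     % initial reduction to Hessenberg form
m = n;
for k = 1:K
    if m == 1, d(1) = H(1,1); return, end
    if abs(H(m,m-1)) < tol*(abs(H(m-1,m-1)) + abs(H(m,m)))
        d(m) = H(m,m);           % converged eigenvalue: deflate
        m = m-1; H = H(1:m,1:m);
        continue
    end
    s = H(m,m);                  % Rayleigh quotient shift
    [Q,R] = qr(H - s*eye(m));
    H = R*Q + s*eye(m);          % one shifted QR step, still Hessenberg
end
d(1:m) = eig(H);                 % fallback if K steps were not enough
end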

Relation to the inverse power method
Taking adjoints in $A - s_k I = Q_k R_k$ gives $(A - s_k I)^* = R_k^* Q_k^*$, so $(A - s_k I)^* Q_k = R_k^*$ and
$$(A - s_k I)^* Q_k e_n = R_k^* e_n = \bar{r}_{nn}^{(k)} e_n.$$
Thus the last column $Q_k e_n$ is the result of one iteration of the inverse power method, applied to $A^*$ with shift $\bar{s}_k$ and starting vector $e_n$.

Choice of shifts
1. The shift $s_k := e_n^T A_k e_n$ is called the Rayleigh quotient shift.
2. The eigenvalue of the lower right $2 \times 2$ corner of $A_k$ closest to the $(n, n)$ element of $A_k$ is called the Wilkinson shift. This shift can be used to find complex eigenvalues of a real matrix.
3. The convergence is very fast, at least quadratic, for both the Rayleigh quotient shift and the Wilkinson shift.
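For completeness, a small helper computing the Wilkinson shift from the trailing $2 \times 2$ block; this formulation via eig is my own (closed-form versions are also common):

function s = wilkinson_shift(H)
% Eigenvalue of the trailing 2x2 block of H closest to H(n,n).
n = size(H,1);
e = eig(H(n-1:n,n-1:n));         % two eigenvalues, possibly complex
[~,j] = min(abs(e - H(n,n)));
s = e(j);
end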

The implicitly shifted QR algorithm
1. By doing two QR iterations at a time it is possible to find both real and complex eigenvalues without using complex arithmetic. The corresponding algorithm is called the implicitly shifted QR algorithm.
2. After having computed the eigenvalues we can compute the eigenvectors in steps. First we find the eigenvectors of the triangular or quasi-triangular matrix; we then compute the eigenvectors of the upper Hessenberg matrix, and finally we obtain the eigenvectors of $A$.
3. Practical experience indicates that only $O(n)$ QR iterations are needed to find all eigenvalues of $A$. Thus both the explicitly and the implicitly shifted QR algorithms are normally $O(n^3)$ algorithms.