Bindel, Fall 2016, Matrix Computations (CS 6210)

Notes for 2016-10-17

1 Power iteration

In most introductory linear algebra classes, one computes eigenvalues as roots of a characteristic polynomial. For most problems, this is a bad idea: the roots of the characteristic polynomial are often very sensitive to changes in the polynomial coefficients, even when they correspond to well-conditioned eigenvalues. Rather than starting from this point, we will start with another idea: the power iteration.

Suppose $A \in \mathbb{C}^{n \times n}$ is diagonalizable, with eigenvalues $\lambda_1, \ldots, \lambda_n$ ordered so that $|\lambda_1| \geq |\lambda_2| \geq \ldots \geq |\lambda_n|$. Then we have $A = V \Lambda V^{-1}$ where $\Lambda = \operatorname{diag}(\lambda_1, \ldots, \lambda_n)$. Now, note that
\[
  A^k = (V \Lambda V^{-1})(V \Lambda V^{-1}) \cdots (V \Lambda V^{-1}) = V \Lambda^k V^{-1},
\]
or, to put it differently,
\[
  A^k V = V \Lambda^k.
\]
Now, suppose we have a randomly chosen vector $x = V \tilde{x} \in \mathbb{C}^n$, and consider
\[
  A^k x = A^k V \tilde{x} = V \Lambda^k \tilde{x} = \sum_{j=1}^n v_j \lambda_j^k \tilde{x}_j.
\]
If we pull out a constant factor from this expression, we have
\[
  A^k x = \lambda_1^k \left( \sum_{j=1}^n v_j \left( \frac{\lambda_j}{\lambda_1} \right)^k \tilde{x}_j \right).
\]
If $|\lambda_1| > |\lambda_2|$, then $(\lambda_j/\lambda_1)^k \to 0$ for each $j > 1$, and for large enough $k$ we expect $A^k x$ to be nearly parallel to $v_1$, assuming $\tilde{x}_1 \neq 0$. This is the idea behind the power iteration:
\[
  x^{(k+1)} = \frac{A x^{(k)}}{\|A x^{(k)}\|} = \frac{A^{k+1} x^{(0)}}{\|A^{k+1} x^{(0)}\|}.
\]
Assuming that the first component of $V^{-1} x^{(0)}$ is nonzero and that $|\lambda_1| > |\lambda_2|$, the iterates $x^{(k)}$ converge linearly to the dominant eigenvector of $A$, with the error asymptotically decreasing by a factor of $|\lambda_2|/|\lambda_1|$ at each step.
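As a quick numerical illustration of that rate, consider the following sketch (the diagonal test matrix is made up for this example, chosen so that $|\lambda_2/\lambda_1| = 1/2$ and $v_1$ is the first coordinate vector):

% Sketch: observe the linear convergence rate of power iteration.
% The test matrix is diagonal with eigenvalues 2, 1, 0.5, so the
% error should shrink by about |lambda2/lambda1| = 0.5 per step.
A = diag([2; 1; 0.5]);
v1 = [1; 0; 0];                 % dominant eigenvector
x = randn(3,1); x = x/norm(x);
for k = 1:10
  x = A*x; x = x/norm(x);
  fprintf('%2d: %e\n', k, norm(x - v1*(v1'*x)));  % distance to span(v1)
end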

There are three obvious potential problems with the power method:

1. What if the first component of $V^{-1} x^{(0)}$ is zero?
2. What if $|\lambda_1|/|\lambda_2|$ is near one?
3. What if we want the eigenpair $(\lambda_j, v_j)$ for $j \neq 1$?

The first point turns out to be a non-issue: if we choose $x^{(0)}$ at random, then the first component of $V^{-1} x^{(0)}$ will be nonzero with probability 1. Even if we were so extraordinarily unlucky as to choose a starting vector for which $V^{-1} x^{(0)}$ did have a zero leading coefficient, perturbations due to floating point arithmetic would generally bump us to the case in which we had a nonzero coefficient. The second and third points turn out to be more interesting, and we address them now.

2 Spectral transformation and shift-invert

Suppose again that $A$ is diagonalizable with $A = V \Lambda V^{-1}$. The power iteration relies on the identity $A^k = V \Lambda^k V^{-1}$. Now, suppose that $f(z)$ is any function that is defined locally by a convergent power series. Then as long as the eigenvalues are within the radius of convergence, we can define $f(A)$ via the same power series, and
\[
  f(A) = V f(\Lambda) V^{-1}
\]
where $f(\Lambda) = \operatorname{diag}(f(\lambda_1), f(\lambda_2), \ldots, f(\lambda_n))$. So the spectrum of $f(A)$ is the image of the spectrum of $A$ under the mapping $f$, a fact known as the spectral mapping theorem.

As a particular instance, consider the function $f(z) = (z - \sigma)^{-1}$. This gives us
\[
  (A - \sigma I)^{-1} = V (\Lambda - \sigma I)^{-1} V^{-1},
\]
and so if we run power iteration on $(A - \sigma I)^{-1}$, we will converge to the eigenvector corresponding to the eigenvalue $\lambda_j$ for which $|\lambda_j - \sigma|^{-1}$ is maximal; that is, we find the eigenvalue closest to $\sigma$ in the complex plane. Running the power method on $(A - \sigma I)^{-1}$ is sometimes called the shift-invert power method.
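In code, shift-invert only changes the update step of the basic power iteration: instead of multiplying by $A$, we solve a system with $A - \sigma I$. The sketch below is illustrative and not one of the codes in Section 5 (the name sipower and the fixed-shift interface are our own); it factors $A - \sigma I$ once so that each step costs only a pair of triangular solves.

% Sketch (not part of the course codes): shift-invert power iteration
% with a fixed shift sigma. Converges to the eigenpair of A whose
% eigenvalue is closest to sigma, assuming that eigenvalue is unique.
function [v,lambda] = sipower(A, sigma, maxiter, rtol)
  if nargin < 3, maxiter = 1000; end
  if nargin < 4, rtol = 1e-3; end
  v = randn(length(A),1);
  v = v / norm(v);
  normAest = sqrt(norm(A,1) * norm(A,inf));
  [L,U,P] = lu(A - sigma*eye(length(A)));  % factor once, reuse every step
  for k = 1:maxiter
    v = U\(L\(P*v));            % power step with (A - sigma*I)^{-1}
    v = v / norm(v);
    Av = A*v;
    lambda = v'*Av;             % Rayleigh quotient for A itself
    r = Av - v*lambda;          % residual for the original problem
    if norm(r) < rtol*normAest
      return;
    end
  end
  warning('sipower did not converge');
end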

3 Changing shifts

If we know a shift $\sigma$ that is close to a desired eigenvalue, the shift-invert power method may be a reasonable method. But even with a good choice of shift, this method converges at best linearly (i.e. the error goes down by a constant factor at each step). We can do better by choosing a shift dynamically, so that as we improve the eigenvector, we also get a more accurate shift.

Suppose $\hat{v}$ is an approximate eigenvector for $A$, i.e. we can find some $\hat{\lambda}$ so that
\[
  A \hat{v} - \hat{v} \hat{\lambda} \approx 0. \tag{1}
\]
The choice of corresponding approximate eigenvalue is not so clear, but a reasonable choice (which is always well-defined when $\hat{v}$ is nonzero) comes from multiplying (1) by $\hat{v}^*$ and changing the $\approx$ to an equal sign:
\[
  \hat{v}^* A \hat{v} - \hat{v}^* \hat{v} \hat{\lambda} = 0.
\]
The resulting eigenvalue approximation $\hat{\lambda}$ is the Rayleigh quotient:
\[
  \hat{\lambda} = \frac{\hat{v}^* A \hat{v}}{\hat{v}^* \hat{v}}.
\]
If we dynamically choose shifts for shift-invert steps using Rayleigh quotients, we get the Rayleigh quotient iteration:
\[
  \lambda_{k+1} = \frac{(v^{(k)})^* A v^{(k)}}{(v^{(k)})^* v^{(k)}},
  \qquad
  v^{(k+1)} = \frac{(A - \lambda_{k+1} I)^{-1} v^{(k)}}{\|(A - \lambda_{k+1} I)^{-1} v^{(k)}\|_2}.
\]
Unlike the power method, the Rayleigh quotient iteration has locally quadratic convergence, so once convergence sets in, the number of correct digits roughly doubles from step to step. We will return to this method later when we discuss symmetric matrices, for which the Rayleigh quotient iteration has locally cubic convergence.

4 Subspaces and orthogonal iteration

So far, we have still not really addressed the issue of dealing with clustered eigenvalues. For example, in power iteration, what should we do if $\lambda_1$ and $\lambda_2$ are very close? If the ratio between the two eigenvalues is nearly one, we don't expect the power method to converge quickly; and we are unlikely to have at hand a shift which is much closer to $\lambda_1$ than to $\lambda_2$, so shift-invert power iteration might not help much. In this case, we might want to relax our question and look for the invariant subspace associated with $\lambda_1$ and $\lambda_2$ (and maybe more eigenvalues, if more of them are clustered together with $\lambda_1$) rather than looking for the eigenvector associated with $\lambda_1$. This is the idea behind subspace iteration.

In subspace iteration, rather than looking at $A^k x_0$ for some initial vector $x_0$, we look at $\mathcal{V}_k = A^k \mathcal{V}_0$, where $\mathcal{V}_0$ is some initial subspace. If $\mathcal{V}_0$ is a $p$-dimensional space, then under some mild assumptions the space $\mathcal{V}_k$ will asymptotically converge to the $p$-dimensional invariant subspace of $A$ associated with the $p$ eigenvalues of $A$ with largest modulus. The analysis is basically the same as the analysis for the power method.

In order to actually compute, though, we need bases for the subspaces $\mathcal{V}_k$. Let us define these bases by the recurrence
\[
  Q_{k+1} R_{k+1} = A Q_k
\]
where $Q_0$ is a matrix with $p$ orthonormal columns and $Q_{k+1} R_{k+1}$ represents an economy QR decomposition. This recurrence is called orthogonal iteration, since the columns of $Q_{k+1}$ are an orthonormal basis for the range space of $A Q_k$, and the span of $Q_k$ is the span of $A^k Q_0$. Assuming there is a gap between $|\lambda_p|$ and $|\lambda_{p+1}|$, orthogonal iteration will usually converge to an orthonormal basis for the invariant subspace spanned by the first $p$ eigenvectors of $A$.

But it is interesting to look not only at the behavior of the subspace, but also at the behavior of the individual columns. For example, notice that the first column $q_{k,1}$ of $Q_k$ satisfies the recurrence
\[
  q_{k+1,1} r_{k+1,11} = A q_{k,1},
\]
which means that the vectors $q_{k,1}$ evolve according to the power method! So over time, we expect the first columns of the $Q_k$ to converge to the dominant eigenvector. Similarly, we expect the first two columns of $Q_k$ to converge to a basis for the dominant two-dimensional invariant subspace, the first three columns to converge to a basis for the dominant three-dimensional invariant subspace, and so on. This observation suggests that we might be able to get a complete list of nested invariant subspaces by letting the initial $Q_0$ be some square matrix. This is the basis for the workhorse of nonsymmetric eigenvalue algorithms, the QR method, to which we shall turn next time.
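The following sketch (our own illustration, with an arbitrarily constructed test matrix whose eigenvalues are the distinct real numbers $6, 5, \ldots, 1$) runs orthogonal iteration with a square $Q_0$; after many steps, $Q^* A Q$ is nearly upper triangular, with the eigenvalues appearing on its diagonal:

% Sketch: orthogonal iteration with a square starting matrix Q0.
% The test matrix is built to have real eigenvalues 6, 5, ..., 1,
% so all the moduli are distinct and convergence is clean.
n = 6;
W = randn(n);
A = W*diag(n:-1:1)/W;         % diagonalizable with known eigenvalues
[Q,~] = qr(randn(n));         % random square orthogonal Q0
for k = 1:200
  [Q,R] = qr(A*Q);            % Q_{k+1} R_{k+1} = A Q_k (full QR)
end
T = Q'*A*Q;
disp(norm(tril(T,-1),'fro'))  % strictly lower part: nearly zero
disp(sort(diag(T),'descend')) % approaches 6, 5, ..., 1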

5 Codes

5.1 Basic power iteration

% [v,lambda] = power(A, v, maxiter, rtol)
%
% Run power iteration to compute the dominant eigenvalue of A and
% an associated eigenvector. This will fail in general if there are
% multiple dominant eigenvalues (e.g. from a complex conjugate pair).
%
% Inputs:
%   A:       Matrix to be analyzed
%   v:       Start vector (default random)
%   maxiter: Maximum number of iterations allowed (default 1000)
%   rtol:    Rel residual tolerance for convergence (default 1e-3)
%
function [v,lambda] = power(A, v, maxiter, rtol)

  % Fill in default parameters
  if nargin < 2, v = []; end
  if nargin < 3, maxiter = 1000; end
  if nargin < 4, rtol = 1e-3; end

  % Start with a random vector by default
  if isempty(v)
    v = randn(length(A),1);
  end
  v = v / norm(v);

  % Get estimate of 2-norm of A for convergence test
  normAest = sqrt(norm(A,1) * norm(A,inf));

  % Run the iteration
  Av = A*v;
  for k = 1:maxiter

    % Take a power method step and compute Rayleigh quotient
    v = Av/norm(Av);
    Av = A*v;
    lambda = v'*Av;

    % Compute the residual and check vs tolerance
    r = Av-v*lambda;
    normr = norm(r);
    if normr < rtol*normAest
      return;
    end

  end

  % If we get here, give a warning
  warning(sprintf('Power did not converge (rel resid %e after %d steps)', ...
                  normr/normAest, maxiter));

end

5.2 Rayleigh quotient iteration

% [v,lambda] = rqi(A, lambda, v, maxiter, rtol)
%
% Run Rayleigh quotient iteration to compute an eigenpair of A.
%
% Inputs:
%   A:       Matrix to be analyzed
%   lambda:  Initial shift
%   v:       Start vector (default random)
%   maxiter: Maximum number of iterations allowed (default 1000)
%   rtol:    Rel residual tolerance for convergence (default 1e-8)
%
function [v,lambda] = rqi(A, lambda, v, maxiter, rtol)

  % Fill in default parameters
  if nargin < 2, lambda = 0; end
  if nargin < 3, v = []; end
  if nargin < 4, maxiter = 1000; end
  if nargin < 5, rtol = 1e-8; end

  % Start with a random vector by default
  if isempty(v)
    v = randn(length(A),1);
  end
  v = v / norm(v);

  % Get estimate of 2-norm of A for convergence test
  normAest = sqrt(norm(A,1) * norm(A,inf));
  I = eye(length(v));

  % Run the iteration
  for k = 1:maxiter

    % Take a shift-invert step and compute a new Rayleigh quotient
    v = (A-lambda*I)\v;
    v = v/norm(v);
    Av = A*v;
    lambda = v'*Av;

    % Compute the residual and check vs tolerance
    r = Av-v*lambda;
    normr = norm(r);
    if normr < rtol*normAest
      return;
    end

  end

  % If we get here, give a warning
  warning(sprintf('RQI did not converge (rel resid %e after %d steps)', ...
                  normr/normAest, maxiter));

end

5.3 Subspace iteration

% [V,L] = subspace(A, k, V, maxiter, rtol)
%
% Orthogonal iteration to compute the dominant invariant subspace of A.
%
% Inputs:
%   A:       Matrix to be analyzed
%   k:       Dimension of subspace
%   V:       Start basis (default random)
%   maxiter: Maximum number of iterations allowed (default 1000)
%   rtol:    Rel residual tolerance for convergence (default 1e-4)
%
function [V,L] = subspace(A, k, V, maxiter, rtol)

  % Fill in default parameters
  if nargin < 2, k = 1; end
  if nargin < 3, V = []; end
  if nargin < 4, maxiter = 1000; end
  if nargin < 5, rtol = 1e-4; end

  % Start with a random basis by default, and orthonormalize it
  if isempty(V)
    V = randn(length(A),k);
  end
  [V,~] = qr(V,0);

  % Get estimate of 2-norm of A for convergence test
  normAest = sqrt(norm(A,1) * norm(A,inf));

  % Run the iteration
  AV = A*V;
  for iter = 1:maxiter

    % Take an orthogonal iteration step and compute a Rayleigh quotient matrix
    [V,R] = qr(AV,0);
    AV = A*V;
    L = V'*AV;

    % Compute the residual and check vs tolerance
    r = AV-V*L;
    normr = norm(r,'fro');
    if normr < rtol*normAest
      return;
    end

  end

  % If we get here, give a warning
  warning(sprintf('Subspace iteration did not converge (rel resid %e after %d steps)', ...
                  normr/normAest, maxiter));

end
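5.4 A sample driver

The routines above are not exercised in these notes, so here is a minimal driver sketch (our own addition, with an arbitrary symmetric test matrix). One caution: power and subspace shadow MATLAB built-ins of the same names, so these files must appear ahead of the built-ins on the search path.

% Sketch of a driver for the routines above, on an arbitrary
% symmetric test matrix. Residual norms should be small relative
% to norm(A) on return.
rng(0);                            % reproducible test
A = randn(100); A = A + A';        % symmetric, so eigenvalues are real

[v1, lam1] = power(A);             % dominant eigenpair
[v2, lam2] = rqi(A, lam1 + 0.1);   % refine with RQI, shift started near lam1
[V, L]     = subspace(A, 3);       % dominant three-dimensional subspace

fprintf('power resid:    %e\n', norm(A*v1 - v1*lam1));
fprintf('rqi resid:      %e\n', norm(A*v2 - v2*lam2));
fprintf('subspace resid: %e\n', norm(A*V - V*L, 'fro'));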