
Week 10: Wednesday and Friday, Oct 24 and 26

Orthogonal iteration to QR

On Monday, we went through a somewhat roundabout algebraic path from orthogonal subspace iteration to the QR iteration. Let me start this lecture with a much more concise version:

1. The orthogonal iteration Q^{(k+1)} R^{(k)} = A Q^{(k)} is a generalization of the power method. In fact, the first column of this iteration is exactly the power iteration. In general, the first p columns of Q^{(k)} are converging to an orthonormal basis for a p-dimensional invariant subspace associated with the p eigenvalues of A with largest modulus (assuming that there aren't several eigenvalues with the same modulus to make this ambiguous).

2. If all the eigenvalues have different modulus, orthogonal iteration ultimately converges to the orthogonal factor in a Schur form AU = UT. What about the T factor? Note that T = U^* A U, so a natural approximation to T at step k would be A^{(k)} = (Q^{(k)})^* A Q^{(k)}, and from the definition of the subspace iteration, we have

       A^{(k)} = (Q^{(k)})^* Q^{(k+1)} R^{(k)} = \bar{Q}^{(k)} R^{(k)},

   where \bar{Q}^{(k)} = (Q^{(k)})^* Q^{(k+1)} is unitary.

3. Note that

       A^{(k+1)} = (Q^{(k+1)})^* A Q^{(k+1)} = (\bar{Q}^{(k)})^* A^{(k)} \bar{Q}^{(k)} = R^{(k)} \bar{Q}^{(k)}.

   Thus, we can go from A^{(k)} to A^{(k+1)} directly, without the orthogonal factors from subspace iteration, simply by computing

       A^{(k)} = \bar{Q}^{(k)} R^{(k)}
       A^{(k+1)} = R^{(k)} \bar{Q}^{(k)}.

   This is the QR iteration.
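To see the iteration in action, here is a minimal MATLAB sketch (an illustration, not one of the lecture codes); the test matrix and the iteration count are arbitrary:

% Minimal sketch of the (unshifted) QR iteration A^(k+1) = R^(k) Q^(k).
V = randn(4);
A = V*diag([4 3 2 1])/V;          % nonsymmetric, eigenvalues 4, 3, 2, 1
Ak = A;
for k = 1:50
  [Q,R] = qr(Ak);                 % Ak = Q*R
  Ak = R*Q;                       % = Q'*Ak*Q, so the spectrum is preserved
end
disp(norm(tril(Ak,-1),'fro'))     % strictly lower triangle decays toward zero
disp(sort(diag(Ak),'descend'))    % diagonal approaches the eigenvalues 4, 3, 2, 1

Because the eigenvalues have distinct moduli, the iterates approach upper triangular (Schur) form, with the eigenvalues appearing on the diagonal in order of decreasing modulus.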

Hessenberg matrices and QR steps in O(n^2)

A matrix H is said to be upper Hessenberg if it has nonzeros only in the upper triangle and the first subdiagonal. For example, the nonzero structure of a 5-by-5 Hessenberg matrix is

    [ x x x x x ]
    [ x x x x x ]
    [   x x x x ]
    [     x x x ]
    [       x x ]

For any square matrix A, we can find a unitarily similar Hessenberg matrix H = Q^* A Q by the following algorithm (see for comparison the Householder QR code in lecture 18):

function [H,Q] = lec27hess(A)
  % Compute the Hessenberg decomposition H = Q'*A*Q using
  % Householder transformations.

  n = length(A);
  Q = eye(n);    % Orthogonal transform so far
  H = A;         % Transformed matrix so far

  for j = 1:n-2

    % -- Find W = I-2vv' to put zeros below H(j+1,j)
    u = H(j+1:end,j);
    u(1) = u(1) + sign(u(1))*norm(u);
    v = u/norm(u);

    % -- H := WHW', Q := QW
    H(j+1:end,:) = H(j+1:end,:)-2*v*(v'*H(j+1:end,:));
    H(:,j+1:end) = H(:,j+1:end)-(H(:,j+1:end)*(2*v))*v';
    Q(:,j+1:end) = Q(:,j+1:end)-(Q(:,j+1:end)*(2*v))*v';

  end
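As a quick sanity check (again an illustration rather than one of the lecture codes), we can verify on a random matrix that lec27hess returns an orthogonal Q and an upper Hessenberg H with Q*H*Q' equal to A:

% Sanity check for lec27hess: H should be upper Hessenberg, Q orthogonal,
% and Q*H*Q' should reproduce A (all up to roundoff).
A = randn(6);
[H,Q] = lec27hess(A);
fprintf('below-subdiagonal part: %.2e\n', norm(tril(H,-2),'fro'));
fprintf('orthogonality error:    %.2e\n', norm(Q'*Q - eye(6),'fro'));
fprintf('similarity error:       %.2e\n', norm(Q*H*Q' - A,'fro'));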

A Hessenberg matrix H is very nearly upper triangular, and is an interesting object in its own right for many applications. For example, in control theory, one sometimes would like to evaluate a transfer function

    h(s) = c^T (sI - A)^{-1} b + d

for many different values of s. Done naively, it looks like each evaluation would require O(n^3) time in order to get a factorization of sI - A; but if H = Q^* A Q is upper Hessenberg, we can write

    h(s) = (Q^* c)^* (sI - H)^{-1} (Q^* b) + d,

and the Hessenberg structure of sI - H allows us to do Gaussian elimination on it in O(n^2) time.
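To make the O(n^2) claim concrete, here is a sketch of a Hessenberg solver; the function name hess_solve and the use of Givens rotations (rather than plain row elimination) are my choices for illustration, not something from the lecture:

function x = hess_solve(H, s, rhs)
  % Solve (s*I - H)*x = rhs in O(n^2) when H is upper Hessenberg, using
  % Givens rotations to eliminate the single nonzero subdiagonal.
  n = length(H);
  T = s*eye(n) - H;                 % upper Hessenberg
  x = rhs;
  for j = 1:n-1
    G = planerot(T(j:j+1,j));       % 2-by-2 rotation zeroing T(j+1,j)
    T(j:j+1,j:end) = G*T(j:j+1,j:end);
    x(j:j+1)       = G*x(j:j+1);
  end
  x = T\x;                          % T is now upper triangular: back substitution
end

With [H,Q] = lec27hess(A) computed once at O(n^3) cost, each further evaluation is then h = (Q'*c)'*hess_solve(H, s, Q'*b) + d at O(n^2) cost per value of s.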

Just as it makes it cheap to do Gaussian elimination, the special structure of the Hessenberg matrix also makes the Householder QR routine very economical. The Householder reflection computed in order to introduce a zero in the (j+1, j) entry needs only to operate on rows j and j+1. Therefore, we have

    Q^* H = W_{n-1} W_{n-2} \cdots W_1 H = R,

where W_j is a Householder reflection that operates only on rows j and j+1. Computing R costs O(n^2) time, since each W_j only affects two rows (O(n) data). Now, note that

    RQ = R (W_1 W_2 \cdots W_{n-1});

that is, RQ is computed by an operation that first mixes the first two columns, then columns two and three, and so on. The only subdiagonal entries that can be introduced in this process lie on the first subdiagonal, and so RQ is again a Hessenberg matrix. Therefore, one step of QR iteration on a Hessenberg matrix results in another Hessenberg matrix, and a Hessenberg QR step can be performed in O(n^2) time. Putting these ideas in concrete form, we have the following code:

function H = lec27hessqr(H)
  % Basic Hessenberg QR step via Householder transformations.

  n = length(H);
  V = zeros(2,n-1);

  % Compute the QR factorization
  for j = 1:n-1

    % -- Find W_j = I-2vv' to put zero into H(j+1,j)
    u = H(j:j+1,j);
    u(1) = u(1) + sign(u(1))*norm(u);
    v = u/norm(u);
    V(:,j) = v;

    % -- H := W_j H
    H(j:j+1,:) = H(j:j+1,:)-2*v*(v'*H(j:j+1,:));

  end

  % Compute RQ
  for j = 1:n-1

    % -- H := H W_j
    v = V(:,j);
    H(:,j:j+1) = H(:,j:j+1)-(H(:,j:j+1)*(2*v))*v';

  end
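As an illustration (not one of the lecture codes), the following sketch runs several Hessenberg QR steps on a matrix with known, well-separated eigenvalues and watches the trailing subdiagonal entry decay:

% Repeated Hessenberg QR steps preserve the Hessenberg structure and the
% spectrum, and drive H(n,n-1) toward zero.
[Q0,~] = qr(randn(6));
A = Q0*diag([6 5 4 3 2 1])*Q0';    % known, well-separated eigenvalues
[H,Q] = lec27hess(A);              % reduce to Hessenberg form first
n = length(H);
for k = 1:15
  H = lec27hessqr(H);              % one O(n^2) step: H := R*Q
  fprintf('step %2d: |H(n,n-1)| = %.2e\n', k, abs(H(n,n-1)));
end
% eig(H) still agrees with eig(A), and tril(H,-2) stays at roundoff level,
% so H remains upper Hessenberg throughout.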

Inverse iteration and the QR method

When we discussed the power method, we found that we could improve convergence by a spectral transformation that mapped the eigenvalue we wanted to something with large magnitude (preferably much larger than the other eigenvalues). This was the shift-invert strategy. We already know there is a connection leading from the power method to orthogonal iteration to the QR method, which we can summarize with a small number of formulas. Let us see if we can follow the same path to uncover a connection from inverse iteration (the power method with A^{-1}, a special case of shift-invert in which the shift is zero) to QR. If we call the orthogonal factors in orthogonal iteration Q^{(k)} (with Q^{(0)} = I) and the iterates in QR iteration A^{(k)}, we have

(1)    A^k = Q^{(k)} R^{(k)}
(2)    A^{(k)} = (Q^{(k)})^* A Q^{(k)}.

In particular, note that because R^{(k)} is upper triangular,

    A^k e_1 = (Q^{(k)} e_1) r^{(k)}_{11};

that is, the first column of Q^{(k)} corresponds to the kth step of power iteration starting at e_1. What happens when we consider negative powers of A? Inverting (1), we find

    A^{-k} = (R^{(k)})^{-1} (Q^{(k)})^*.

The matrix \tilde{R}^{(k)} = (R^{(k)})^{-1} is again upper triangular; and if we look carefully, we can see in this fact another power iteration:

    e_n^* A^{-k} = e_n^* \tilde{R}^{(k)} (Q^{(k)})^* = \tilde{r}^{(k)}_{nn} (Q^{(k)} e_n)^*.

That is, the last column of Q^{(k)} corresponds to a power iteration converging to a row eigenvector of A^{-1}.

Shifting gears

The connection from inverse iteration to orthogonal iteration (and thus to QR iteration) gives us a way to incorporate the shift-invert strategy into QR iteration: simply run QR on the matrix A - σI, and the (n, n) entry of the iterates (which corresponds to a Rayleigh quotient with an increasingly good approximate row eigenvector) should start to converge to λ - σ, where λ is the eigenvalue nearest σ. Put differently, we can run the iteration:

    Q^{(k)} R^{(k)} = A^{(k-1)} - σI
    A^{(k)} = R^{(k)} Q^{(k)} + σI.

If we choose a good shift, then the lower right corner entry of A^{(k)} should converge to the eigenvalue closest to σ in fairly short order, and the rest of the elements in the last row should converge to zero.

The shift-invert power iteration converges fastest when we choose a shift that is close to the eigenvalue that we want. We can do even better if we choose a shift adaptively, which was the basis for running Rayleigh quotient iteration. The same idea is the basis for the shifted QR iteration:

(3)    Q^{(k)} R^{(k)} = A^{(k-1)} - σ_k I
(4)    A^{(k)} = R^{(k)} Q^{(k)} + σ_k I.
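Here is a sketch of iteration (3)-(4) in MATLAB (illustrative only), using the corner-entry shift σ_k = A^{(k-1)}(n,n) discussed in the next paragraph; for this symmetric test matrix, the trailing subdiagonal entry typically reaches roundoff within a handful of steps:

% Sketch of the shifted QR iteration (3)-(4) with the corner-entry shift.
n = 6;
[Q0,~] = qr(randn(n));
A = Q0*diag(1:n)*Q0';            % symmetric test matrix with eigenvalues 1..n
Ak = lec27hess(A);               % start from Hessenberg (here tridiagonal) form
for k = 1:6
  sigma = Ak(n,n);               % Rayleigh-quotient-like shift from the corner
  [Q,R] = qr(Ak - sigma*eye(n));
  Ak = R*Q + sigma*eye(n);       % similar to A, so eigenvalues are unchanged
  fprintf('step %d: shift = %8.5f, |A(n,n-1)| = %.2e\n', k, sigma, abs(Ak(n,n-1)));
end
% Once |A(n,n-1)| is negligible, A(n,n) is an eigenvalue and we can deflate.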

This iteration is equivalent to computing

    \underline{Q}^{(k)} \underline{R}^{(k)} = \prod_{j=1}^{k} (A - σ_j I)
    A^{(k)} = (\underline{Q}^{(k)})^* A \underline{Q}^{(k)}
    \underline{Q}^{(k)} = Q^{(1)} Q^{(2)} \cdots Q^{(k)},

where \underline{R}^{(k)} = R^{(k)} \cdots R^{(1)} is upper triangular. What should we use for the shift parameters σ_k? A natural choice is to use σ_k = e_n^* A^{(k-1)} e_n, which is the same as σ_k = (\underline{Q}^{(k-1)} e_n)^* A (\underline{Q}^{(k-1)} e_n), the Rayleigh quotient based on the last column of \underline{Q}^{(k-1)}. This simple shifted QR iteration is equivalent to running Rayleigh quotient iteration starting from an initial vector of e_n, which we noted before is locally quadratically convergent.

Double trouble

The simple shift strategy we described in the previous section gives local quadratic convergence, but it is not globally convergent. As a particularly pesky example, consider what happens if we want to compute a complex conjugate pair of eigenvalues of a real matrix. With our simple shifting strategy, the iteration (3)-(4) will never produce a complex iterate, a complex shift, or a complex eigenvalue. The best we can hope for is that our initial shift is closer to both eigenvalues in the conjugate pair than it is to anything else in the spectrum; in this case, we will most likely find that the last two columns of Q^{(k)} are converging to a basis for an invariant row subspace of A, and the corresponding eigenvalues are the eigenvalues of the trailing 2-by-2 sub-block. Fortunately, we know how to compute the eigenvalues of a 2-by-2 matrix! This suggests the following shift strategy: let σ_k be one of the eigenvalues of A^{(k-1)}(n-1:n, n-1:n). Because this 2-by-2 problem can have complex roots even when the matrix is real, this shift strategy allows the possibility that we could converge to complex eigenvalues.

On the other hand, if our original matrix is real, perhaps we would like to consider the real Schur form, in which U is a real orthogonal matrix and T is block upper triangular with 1-by-1 and 2-by-2 diagonal blocks that correspond, respectively, to real and complex eigenvalues.
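Before turning to the double-shift formulation, here is a tiny illustration (not from the lecture) of the failure mode just described: for a 2-by-2 rotation matrix, whose eigenvalues are ±i, the corner-entry shift is always zero and the iteration goes nowhere, while the eigenvalues of the (trailing) 2-by-2 block immediately provide the complex shifts ±i.

% The simple corner shift stagnates on a real matrix with complex eigenvalues.
A = [0 1; -1 0];                 % eigenvalues are +i and -i
for k = 1:5
  sigma = A(2,2);                % corner shift: always 0 here
  [Q,R] = qr(A - sigma*eye(2));
  A = R*Q + sigma*eye(2);        % stays a real rotation; |A(2,1)| never decays
end
disp(A)                          % still (up to signs) the original rotation
disp(eig(A))                     % the trailing 2-by-2 block is A itself: +i, -i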

If we shift with both roots of A^{(k-1)}(n-1:n, n-1:n), that is equivalent to computing

    Q^{(k)} R^{(k)} = (A^{(k-1)} - σ_{k+} I)(A^{(k-1)} - σ_{k-} I)
    A^{(k)} = (Q^{(k)})^* A^{(k-1)} Q^{(k)}.

There is one catch here: even if we started with A^{(0)} in Hessenberg form, it is unclear how to do this double-shift step in O(n^2) time! The following fact will prove our salvation: if Q and V are both orthogonal matrices, Q^T A Q and V^T A V are both (unreduced¹) Hessenberg, and the first column of Q is the same as the first column of V, then all successive columns of Q are unit scalar multiples of the corresponding columns of V. This is the implicit Q theorem. Practically, it means that we can do any sort of shifted QR step we would like in the following way:

1. Apply as a similarity any transformations in the QR decomposition that affect the leading submatrix (1-by-1 or 2-by-2).

2. Restore the resulting matrix to Hessenberg form without further transformations to the leading submatrix.

In the first step, we effectively compute the first column of Q; in the second step, we effectively compute the remaining columns. Certainly we compute some transformation with the right leading column; and the implicit Q theorem tells us that any such transformation is basically the one we would have computed with an ordinary QR step.

¹ An unreduced Hessenberg matrix has no zeros on the first subdiagonal.
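To see why the double-shift step can stay in real arithmetic, note that (A^{(k-1)} - σ_{k+} I)(A^{(k-1)} - σ_{k-} I) = (A^{(k-1)})^2 - s A^{(k-1)} + p I, where s = σ_{k+} + σ_{k-} and p = σ_{k+} σ_{k-} are real whenever the two shifts form a complex conjugate pair. The sketch below (illustrative only) carries out one explicit double-shift step this way at O(n^3) cost; the implicit bulge-chasing formulation produces essentially the same iterate in O(n^2):

% One explicit double-shift QR step (O(n^3) version; the implicit form is O(n^2)).
n = 6;
A = hess(randn(n));                       % real upper Hessenberg starting point
B = A(n-1:n,n-1:n);                       % trailing 2-by-2 block
s = trace(B);  p = det(B);                % sum and product of its eigenvalues
M = A*A - s*A + p*eye(n);                 % = (A - sigma_+ I)(A - sigma_- I), real
[Q,R] = qr(M);
A1 = Q'*A*Q;                              % the double-shifted iterate, still real
fprintf('below-subdiagonal part of A1: %.2e\n', norm(tril(A1,-2),'fro'));
% A1 is again (numerically) upper Hessenberg, and eig(A1) agrees with eig(A).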