EIGENVALUE PROBLEMS (EVP)
(Golub & Van Loan: Chaps 7-8; Watkins: Chaps 5-7)
X.-W. Chang and C. C. Paige

PART I. EVP THEORY

EIGENVALUES AND EIGENVECTORS

Let A ∈ C^{n×n}. Suppose Ax = λx with x ≠ 0; then x is a (right) eigenvector of A, corresponding to the eigenvalue λ. If y^H A = λy^H, y ≠ 0, then y is called a left eigenvector of A. (eigen = latent = characteristic value or root)

Q: Prove that if x is an eigenvector, so is αx, for any scalar α ≠ 0. We often take ‖x‖_2 = 1.

For an eigenpair (λ, x), (λI − A)x = 0, x ≠ 0.

Q: This is only possible if λI − A is ... what?

For general λ, π(λ) ≜ det(λI − A) is called the characteristic polynomial of A:

    π(λ) = λ^n − (a_{11} + ... + a_{nn})λ^{n−1} + ... + det(−A),

and π(λ) = 0 is called the characteristic equation. π(λ) has exact degree n, and so has n (not necessarily distinct) zeros λ_1, ..., λ_n, say. The set Λ(A) = {λ_1, ..., λ_n} is called the spectrum of A.

Q: What are the eigenvalues of simple examples, e.g. diagonal and triangular matrices?

If an eigenvalue λ is repeated exactly r times, we say it has algebraic multiplicity r. (We use the adjective "algebraic" because we are dealing with the number of roots of an algebraic equation.)

Q: Can real A have complex eigenvalues? Let i = √(−1), λ = µ + iν, so λ̄ = µ − iν. If Ax = λx and λ is complex, so is x = u + iv. If A is real, then Ax̄ = λ̄x̄. To avoid working in C^n, note that with C ≜ [1 1; i −i],

    A[x, x̄] = A[u, v]C,   and   [x, x̄] diag(λ, λ̄) = [u, v]C diag(λ, λ̄).

Multiplying on the right by C^H/2, and using CC^H = 2I, we have

    A[u, v] = [u, v][µ ν; −ν µ],

where now all matrices are real.
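To make the real 2×2 block concrete, here is a minimal MATLAB check (my illustration, not part of the original notes; the test matrix is arbitrary):

    % Verify A*[u v] = [u v]*[mu nu; -nu mu] for a real matrix with a
    % complex eigenpair lambda = mu + i*nu, x = u + i*v.
    A = [0 1; -1 0];                        % real, eigenvalues +i and -i
    [X, L] = eig(A);
    x = X(:,1);  lambda = L(1,1);
    u = real(x); v = imag(x);
    mu = real(lambda); nu = imag(lambda);
    norm(A*[u v] - [u v]*[mu nu; -nu mu])   % ~1e-16: the identity holds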

Note that [µ ν; −ν µ] has eigenvalues λ and λ̄. Thus for real A with complex λ, we can work in C^n with vectors x, or in R^n with 2-dimensional subspaces spanned by [u, v].

Note: When working with real matrices, we try to stay within the real number system as much as possible, because a complex operation costs much more than a real operation.

A subspace S of C^n is an invariant subspace of A if Ax ∈ S whenever x ∈ S. Let A ∈ C^{n×n}, X ∈ C^{n×p} and B ∈ C^{p×p}. If AX = XB, and X has linearly independent columns, then R(X) is an invariant subspace of A.

Q: For λ ∉ R, show that [u, v] above spans an invariant subspace of A, and that u and v are linearly independent.

Q: Let x_i be an eigenvector of A corresponding to the eigenvalue λ_i. Are x_1, ..., x_n linearly independent?

If an eigenvalue λ has exactly s linearly independent eigenvectors, we say it has geometric multiplicity s (a count of vectors, hence "geometric").

SIMILARITY TRANSFORMATIONS (ST)

If A and X are n×n, and X is nonsingular, B ≜ X^{−1}AX is said to be a similarity transformation of A, and A and B are similar.

Theorem. Similar matrices have the same characteristic polynomial.

Q: Prove this.

Corollary. Similarity transformations preserve eigenvalues and algebraic multiplicities.

Also if Ax = λx with x ≠ 0, then B(X^{−1}x) = X^{−1}AX X^{−1}x = λ(X^{−1}x), and if x_1, ..., x_r are linearly independent, so are X^{−1}x_1, ..., X^{−1}x_r, and vice versa. Thus similarity transformations preserve geometric multiplicities too.

An important class of computational methods for the EVP uses similarity transformations to transform an n×n A to a simple form from which the eigenvalues are easily found.

Q: Suggest some such forms.

Q: What is the simplest form we can transform a general square matrix to using similarity transformations?

JORDAN CANONICAL FORM (JCF)

The k×k matrix J_λ^{(k)}, with λ on the diagonal, 1 on the superdiagonal, and zeros elsewhere, is called a Jordan block. It has one distinct eigenvalue λ, with algebraic multiplicity k and geometric multiplicity 1, since the eigenvector is a multiple of e_1; for J_λ^{(k)}x = λx with x = [ξ_1, ..., ξ_k]^T implies

    λξ_i + ξ_{i+1} = λξ_i,

so ξ_{i+1} = 0, i = 1, ..., k−1.

Theorem. Let A ∈ C^{n×n}; then there exist unique numbers λ_1, λ_2, ..., λ_s (complex, not necessarily distinct), unique positive integers m_1, m_2, ..., m_s with m_1 + ... + m_s = n, and a nonsingular X ∈ C^{n×n}, giving the JCF of A:

    X^{−1}AX = J ≜ diag[J_{λ_1}^{(m_1)}, ..., J_{λ_s}^{(m_s)}].

We see AX = XJ, so with X = [x_1, ..., x_n],

    Ax_1 = λ_1 x_1,   Ax_j = λ_1 x_j + x_{j−1},  j = 2, ..., m_1,

so

    (A − λ_1 I)x_j = x_{j−1},   (A − λ_1 I)^j x_j = 0,   j = 1, 2, ..., m_1.

x_1, ..., x_n are called principal vectors or generalized eigenvectors of A.

Q: Which of these are eigenvectors?

Q: Can we find the algebraic multiplicity and geometric multiplicity of an eigenvalue of A from its JCF?

Q: What is the relationship between the algebraic multiplicity and geometric multiplicity of an eigenvalue?

If all the Jordan blocks corresponding to the same eigenvalue have dimension 1 (i.e., the eigenvalue's algebraic multiplicity is equal to its geometric multiplicity), the eigenvalue is said to be nondefective or semisimple, else defective. If all the eigenvalues of A are nondefective, A is said to be nondefective or semisimple. In this case J is a diagonal matrix and A is said to be diagonalizable.
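The two multiplicities can be checked numerically. A minimal MATLAB sketch (my illustration, not from the notes), using the fact that the geometric multiplicity of λ is n − rank(A − λI):

    A = [2 1 0; 0 2 0; 0 0 2];   % JCF: blocks J_2^(2) and J_2^(1), eigenvalue 2
    lambda = 2;  n = size(A,1);
    geo = n - rank(A - lambda*eye(n))        % geometric multiplicity: 2
    alg = sum(abs(eig(A) - lambda) < 1e-8)   % algebraic multiplicity: 3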

If an eigenvalue appears in two or more Jordan blocks (i.e., the eigenvalue's geometric multiplicity is larger than 1), it is said to be derogatory, else nonderogatory. If A has a derogatory eigenvalue, A is said to be derogatory, else nonderogatory. If an eigenvalue is distinct (i.e., the eigenvalue's algebraic multiplicity is equal to 1), it is said to be simple.

Q: Suppose X^{−1}AX = J = diag[J_2^{(2)}, J_2^{(1)}]. Give the independent eigenvectors of A. Is A defective? Is A derogatory?

Q: If a matrix A ∈ C^{n×n} has n distinct eigenvalues, is A diagonalizable?

For a general matrix A, X in X^{−1}AX = J can be very ill-conditioned. Thus rounding errors in the JCF computation can be greatly magnified, since

    fl(X^{−1}AX) = X^{−1}AX + E,   ‖E‖_2 = O(u) κ_2(X) ‖A‖_2.

Another problem is that a small perturbation in A may change the dimensions of the Jordan blocks completely. The JCF is classical theory: it is important in describing properties of various linear differential equations, and so is important in control theory etc. But we want to avoid the JCF in computations if possible.

SENSITIVITY OF EIGENVALUES

Nondefective matrix:

Theorem. Let A ∈ C^{n×n} be nondefective and suppose X^{−1}AX = D where D is diagonal. Let δA ∈ C^{n×n} be a perturbation of A and let µ be an eigenvalue of A + δA. Then A has an eigenvalue λ such that

    min_{λ ∈ Λ(A)} |λ − µ| ≤ κ_p(X) ‖δA‖_p,   p ≥ 1.

Proof.

Note: If X is unitary (this happens when A is normal, see later), then κ_2(X) = 1 and the absolute condition number (in terms of the 2-norm) of each eigenvalue of A is 1.

Note: κ_p(X) is an overall condition number. It is possible that different eigenvalues may have different sensitivity.

Defective eigenvalues: Consider the k×k Jordan block J_λ^{(k)}. It has eigenvalue λ with algebraic multiplicity k and geometric multiplicity 1, that is, only one eigenvector e_1. Perturbations above the diagonal have no effect on the eigenvalues. But consider a perturbation ǫ in the (k,1) entry, i.e., J_λ^{(k)} + ǫe_k e_1^T. Seek an eigenpair (λ + δλ, y) with y = [1, y_2, ..., y_k]^T:

    row 1:            λ·1 + y_2 = (λ + δλ)·1,       so y_2 = δλ;
    rows i = 2:k−1:   λy_i + y_{i+1} = (λ + δλ)y_i,  so y_{i+1} = (δλ)^i;
    row k:            ǫ·1 + λy_k = (λ + δλ)y_k,      so ǫ = (δλ)^k.

Thus λ + ǫ^{1/k} is an eigenvalue, and [1, ǫ^{1/k}, ..., ǫ^{(k−1)/k}]^T is its eigenvector. For example, λ = 1, k = 10, ǫ = 10^{−10}, ω_j = exp{2πij/10}, j = 1, ..., 10 (the 10 tenth roots of 1), give eigenvalues λ_j = 1 + 0.1ω_j, j = 1, ..., 10: a huge change.

In general, if J_{λ_i}^{(k)} is in the Jordan canonical form of A, then A + δA with ‖δA‖ = ǫ can give perturbations O(ǫ^{1/k}) in λ_i. Defective eigenvalues are very sensitive.
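This sensitivity is easy to observe numerically. A minimal MATLAB experiment (my illustration, not from the notes), reproducing the k = 10 example above:

    k = 10; lambda = 1; ep = 1e-10;
    J = lambda*eye(k) + diag(ones(k-1,1), 1);   % Jordan block J_lambda^(k)
    Jp = J;  Jp(k,1) = ep;                      % perturb the (k,1) entry
    max(abs(eig(Jp) - lambda))                  % ~0.1 = ep^(1/k), though ||dA|| = 1e-10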

Simple eigenvalues: Suppose λ is a simple eigenvalue of A ∈ C^{n×n}. Let X^{−1}AX = J be the JCF. Then with Y ≜ X^{−H}, we have AX = XJ, Y^H A = JY^H. It follows that for some i we have

    AX(:,i) = λX(:,i),   Y(:,i)^H A = λY(:,i)^H,   Y(:,i)^H X(:,i) = 1.

Let x ≜ X(:,i)/‖X(:,i)‖_2 and y ≜ Y(:,i)/‖Y(:,i)‖_2. Suppose we have a small perturbation δA in A, and denote the corresponding perturbations in λ and x by δλ and δx. Thus

    (A + δA)(x + δx) = (λ + δλ)(x + δx).

Using Ax = λx and ignoring the second order terms, we have

    Aδx + δAx = δλx + λδx.

Multiplying this equation on the left by y^H gives

    y^H Aδx + y^H δAx = δλ y^H x + λ y^H δx.

Since y^H A = λy^H and y^H x ≠ 0, we have

    δλ = y^H δA x / (y^H x),   so   |δλ| ≤ ‖δA‖_2 / |y^H x|.

We call κ(A, λ) ≜ 1/|y^H x| the condition number of the eigenvalue λ. The condition that λ be a simple eigenvalue was imposed because under this condition x and y are uniquely determined up to complex scalars of modulus 1, so κ(A, λ) is uniquely determined.

See also Golub and Van Loan 7.2, Watkins 7.1 for many other perturbation results.

PART II. EVP COMPUTATIONS

SCHUR DECOMPOSITION

Unitary transformations preserve size. If Q ∈ C^{n×n} is unitary (i.e., QQ^H = Q^H Q = I), B ≜ Q^H AQ is a unitary similarity transformation of A.

Q: What are the relations between the eigenvalues and singular values of B and those of A?

Q: What is the simplest form of the above B?

Schur's Theorem. Let A ∈ C^{n×n}. There exists a similarity transformation with unitary Q such that Q^H AQ = R, where R is upper triangular. (Golub and Van Loan p313, Watkins p338.)

Proof. We prove the result by induction on the dimension of A. If A is a scalar, the result obviously holds. Suppose Ax = λx, x^H x = 1. Let Q̃ ∈ C^{n×n} be unitary such that Q̃^H x = e_1; thus we can write Q̃ = [x, F]. Then

    Q̃^H AQ̃ = [λ   x^H AF]
              [0   F^H AF].

By the induction hypothesis, there exists a unitary Q_1 ∈ C^{(n−1)×(n−1)} such that Q_1^H F^H AF Q_1 = R_1 is upper triangular. Set Q = Q̃ [1 0; 0 Q_1]. Then

    Q^H AQ = [λ   x^H AF Q_1]  ≜ R,   upper triangular.
             [0       R_1   ]

Real Schur decomposition: If A ∈ R^{n×n}, then there exists real orthogonal Q giving Q^T AQ = R, block upper triangular, with at most 2×2 blocks on the diagonal, these corresponding to complex conjugate pairs; e.g., a 5×5 R with two 2×2 diagonal blocks and one 1×1 diagonal block corresponds to two complex conjugate pairs and one real eigenvalue.

The proof is similar to that for the (complex) Schur decomposition, using real invariant subspaces (see Golub and Van Loan p341).

Note: In either case, the diagonal elements or diagonal blocks can be placed in any desired order.
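Both forms are available in MATLAB; a brief sketch (my illustration, not from the notes):

    A = randn(5);
    [Q, R] = schur(A);            % real Schur form: R quasi-triangular, Q orthogonal
    [Qc, T] = schur(A,'complex'); % complex Schur form: T upper triangular
    norm(Q*R*Q' - A)              % ~1e-15: a unitary similarity of A
    diag(T)                       % the eigenvalues, on the diagonal of T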

HERMITIAN MATRICES A = A^H

Theorem. Hermitian matrices have real eigenvalues, and a complete set of orthonormal eigenvectors, the columns of a unitary Q.

Proof.

Any matrix having a complete set of orthonormal eigenvectors is called a normal matrix. A normal matrix need not be Hermitian, e.g., diag(1, i).

Q: State an important class of not necessarily Hermitian matrices which are normal.

Q: Show that A is normal if and only if AA^H = A^H A.

Theorem. Real symmetric matrices have a complete set of real orthogonal eigenvectors.

Proof. A = A^T ∈ R^{n×n} is Hermitian, so it has real eigenvalues. Thus there are no complex conjugate pairs in the real Schur form, and the real Schur form is diagonal.

ALGORITHMS FOR THE EVP

If A is real with A = A^T, then A = Q diag(λ_i) Q^T (Q^T Q = I), the eigendecomposition of A, is (apart from signs) the same as the SVD.

Q: What is one way of computing this?

For a general A ∈ C^{n×n}, there exists unitary Q such that Q^H AQ = R is upper triangular. Also ‖Q^H AQ‖_{2,F} = ‖A‖_{2,F}, which is numerically desirable. So we seek such Q and R.

Q: Why must methods be iterative for the EVP?

The form of such iterative algorithms for computing the Schur form is:

    A_1 = A,   A_{k+1} = Q_k^H A_k Q_k,  Q_k unitary, k = 1, 2, ...
             = Q_k^H Q_{k−1}^H ... Q_1^H A Q_1 ... Q_{k−1} Q_k,

which is a unitary similarity transformation of A. We try to design the Q_k so that A_k converges to upper triangular form.

BASIC QR ALGORITHM

    A_1 := A
    for k = 1, 2, ... until convergence
        QR factorization of A_k:  A_k = Q_k R_k   (R_k upper triangular, Q_k unitary)
        recombine in the reverse order:  A_{k+1} := R_k Q_k

Note: A_{k+1} = Q_k^H A_k Q_k = Q_k^H ... Q_1^H A Q_1 ... Q_k is a unitary similarity transformation of A. The eigenvalues, singular values etc. are preserved. With refinements this is one of the most effective matrix algorithms for transforming A to upper triangular form using unitary similarity transformations.

Note: If A is real, the algorithm involves only real operations.
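A minimal MATLAB sketch of the basic (unshifted) QR iteration (my illustration, not from the notes); a symmetric test matrix is used so that all eigenvalues are real and A_k tends to diagonal form:

    A = randn(6); A = A + A';   % symmetric test matrix
    Ak = A;
    for k = 1:200
        [Qk, Rk] = qr(Ak);      % A_k = Q_k R_k
        Ak = Rk*Qk;             % A_{k+1} = R_k Q_k = Q_k' A_k Q_k
    end
    [sort(diag(Ak)), sort(eig(A))]   % diagonal of A_k approximates the eigenvalues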

Note: A k+ = Q H k A kq k = Q H k QH AQ Q k is a unitary similarity transformation of A. The eigenvalues, singular values etc. are preserved. With refinements this is one of the most effective matrix algorithms for transforming A to upper triangular form using unitary similarity transformations. Note: If A is real, the algorithm involves only real operations. In the following we present some strategies to improve the efficiency of the basic algorithm. HESSENBERG REDUCTION Q: How many flops does one QR step take? This is very expensive. If A k is close to the upper triangular structure, we can significant reduce the cost. So we reduce A to upper Hessenberg form,, which is the closest we can get to upper triangular form using a fixed number of unitary similarity transformations. Hessenberg reduction process, using Householder transformations (Golub and Van Loan p345, Watkins pp.350 355): {{ {{ design H H form H HA apply H (H HA)H design H2 H form H2 H(HH AH ) apply H 2 (H2 HHH AH )H 2 etc. If A is real, the full reduction takes 0n 3 /3 flops and if Q 0 = H H n 2 is explicitly formed, an additional 4n 3 /3 flops are required (Golub and Van Loan p345). Hence in the Basic QR Algorithm we take A = H H n 2 H H AH H n 2. Q: Do we lose the upper Hessenberg form in the QR steps? In the following, G i is short for G (k) i, a rotation. QR factorization Q H k A k = R k, Q H k = GH n GH 2 GH : 2 Recombine A k+ = R k Q k = R k G G 2 G n : 3 {}} { 2 {}} { 3 {}} { 8

SHIFTING

Let |λ_1| ≥ ... ≥ |λ_n|. If |λ_i| > |λ_{i+1}|, it can be shown that the subdiagonal entry a_{i+1,i}^{(k)} of A_k goes to zero as (|λ_{i+1}|/|λ_i|)^k when k → ∞, so the convergence is linear. We can shift the eigenvalues by using A − µI in place of A. The eigenvalues of A − µI are λ_1 − µ, λ_2 − µ, ..., λ_n − µ. Suppose we renumber the eigenvalues so that |λ_1 − µ| ≥ ... ≥ |λ_n − µ|. Then the new ratios associated with A − µI are |λ_{i+1} − µ|/|λ_i − µ|, i = 1, ..., n−1.

Q: How should we choose µ to obtain fast convergence?

If we apply the QR iterations to Ã_1 = A − µI instead of A, then the subdiagonal entry ã_{n,n−1}^{(k)} will converge to zero quickly. Suppose that after k_0 QR steps ã_{n,n−1}^{(k_0)} is small enough that we may regard it as zero; then we add the shift back, so that Ã_{k_0+1} + µI has last row [0, ..., 0, λ]. Then λ is an eigenvalue of A. The above process can be written as:

    Ã_1 := A − µI
    for k = 1 : k_0
        Ã_k = Q̃_k R̃_k
        Ã_{k+1} = R̃_k Q̃_k
    A_{k_0+1} := Ã_{k_0+1} + µI

Q: Show that the above process can be written as the process below:

    for k = 1 : k_0
        A_k − µI = Q_k R_k
        A_{k+1} = R_k Q_k + µI

But there is no reason to use the same shift µ in all the above QR steps. Also, we usually can only find a good approximation to an eigenvalue during the QR iterations. So we should use different shifts in different QR steps. This idea results in:

Shifted QR Algorithm with Hessenberg Reduction:

    Compute the Hessenberg reduction A_1 := Q_0^H AQ_0, where Q_0 := H_1 ... H_{n−2}
    for k = 1, 2, ... until convergence do
        A_k − µ_k I = Q_k R_k
        A_{k+1} = R_k Q_k + µ_k I

Note: A_{k+1} = Q_k^H (A_k − µ_k I)Q_k + µ_k I = Q_k^H A_k Q_k. Now Q_k depends on µ_k. With the correct choice of shift we get quadratic convergence. If λ_{n−1} ≠ λ_n, it is expected that a_{n,n−1}^{(k)} → 0. So a_{nn}^{(k)} will converge to an eigenvalue of A, and we can take µ_k = a_{nn}^{(k)}. This shift is called the Rayleigh quotient shift. There is another often used shift, which will be introduced later.

DEFLATION

When a_{n,n−1}^{(k)} is small enough, it can be regarded as zero, and a_{n,n}^{(k)} is an (approximate) eigenvalue of A. We then deflate A_k by ignoring the last row and column. The remaining eigenvalues of A are the eigenvalues of A_k(1 : n−1, 1 : n−1). So we can start working with A_k(1 : n−1, 1 : n−1), i.e., apply the shifted QR algorithm to A_k(1 : n−1, 1 : n−1). Note that the dimension of the matrix is now reduced by one.

During the iterations, it may happen that one of the subdiagonal entries a_{i+1,i}^{(k)} (say), other than a_{n,n−1}^{(k)}, becomes very small. Then we just regard it as zero, and work with the two smaller matrices A_k(1 : i, 1 : i) and A_k(i+1 : n, i+1 : n), whose eigenvalues are (approximate) eigenvalues of A.

At the end of the computation, let Q = Q_1 Q_2 ... Q_k, R = R_k, so Q^H AQ = R.

Q: What must hold for the correctly applied QR algorithm to be numerically stable?

SPECIAL CASE A = A^H

Q: What does the upper Hessenberg form become?

Q: Is it preserved in the QR algorithm?

Q: What is the cost per QR step?

Here the QR algorithm has cubic convergence and the eigenvectors are immediately available: Q^H AQ → D diagonal, so the eigenvectors are the columns of Q.
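A minimal MATLAB sketch combining the shifted QR iteration with deflation (my illustration, not from the notes; a symmetric A keeps everything real, and the Rayleigh quotient shift is used):

    A = randn(6); A = A + A';
    H = hess(A);                          % tridiagonal here, since A is symmetric
    n = size(H,1); lam = zeros(n,1); tol = 1e-12;
    while n > 1
        while abs(H(n,n-1)) > tol*(abs(H(n-1,n-1)) + abs(H(n,n)))
            mu = H(n,n);                  % Rayleigh quotient shift
            [Q, R] = qr(H - mu*eye(n));   % (a real code would use Givens rotations)
            H = R*Q + mu*eye(n);          % A_{k+1} = R_k Q_k + mu_k I
        end
        lam(n) = H(n,n);                  % deflate: accept a(n,n) as an eigenvalue
        n = n - 1; H = H(1:n,1:n);        % continue with the leading block
    end
    lam(1) = H(1,1);
    [sort(lam), sort(eig(A))]             % the two columns agree closely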

NUMERICAL DIFFICULTY WITH SHIFTING

Form A − µI with e.g. µ = a_{nn}. Suppose

    A = [10^{−6}  10^{−4}         ]
        [10^{−4}  10^{−2}  10^{−2}]
        [         10^{−2}  10^6   ].

If we use double precision (u ≈ 10^{−16}), subtracting 10^6 I from A would destroy the small elements such as 10^{−6}, since the shift introduces absolute rounding errors of order 10^6 u ≈ 10^{−10}, yet these small elements are almost eigenvalues. So it loses information unnecessarily. We can use implicit shifts in order to avoid this loss.

Q. Use the MATLAB built-in function eig to find the eigenvalues of A − 10^6 I. Then adding 10^6 to the computed eigenvalues gives computed eigenvalues of A. Check how different they are from the computed eigenvalues obtained by applying eig directly to A.
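A sketch of this experiment (my illustration; the entries of A are as reconstructed above):

    A = [1e-6 1e-4 0; 1e-4 1e-2 1e-2; 0 1e-2 1e6];
    mu = 1e6;
    l_shifted = sort(eig(A - mu*eye(3)) + mu);   % shift, compute, shift back
    l_direct  = sort(eig(A));                    % eig applied directly to A
    [l_shifted, l_direct]                        % compare the small eigenvalues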

IMPLICITLY SHIFTED QR ALGORITHM FOR REAL UNSYMMETRIC A

Shifting with µ (an approximation of an eigenvalue) is necessary for fast convergence. Notice that a real unsymmetric A usually has complex conjugate pairs of eigenvalues. A complex µ in Q^H(A − µI) requires complex arithmetic. But for a real matrix, we would like to avoid complex arithmetic as much as possible.

When the QR algorithm converges to a complex conjugate pair, we find the (n−1, n−2) entry converges to 0, and eventually we can deflate.

Q: How do we deflate here?

When the (n−1, n−2) entry is small, the eigenvalues µ_1, µ_2 of the bottom right hand corner 2×2 block are good approximations to eigenvalues of A. If they are real, take µ to be the one closer to the (n,n) entry. This shift is called the Wilkinson shift. If they are a complex conjugate pair, we could do one QR step with µ, the next with µ_2 = µ̄.

Suppose A_1 is real upper Hessenberg. One step of double QR with explicit shifts µ and µ̄ is

    A_1 − µI = Q_1 R_1,   A_2 − µ̄I = Q_2 R_2,   A_2 = R_1 Q_1 + µI,   A_3 = R_2 Q_2 + µ̄I.

This has two drawbacks. One is that although A_1 is real, complex arithmetic is involved in the above computation. The other is that explicit shifting may cause numerical difficulties. In the following, we try to avoid these two drawbacks.

Q. Show that:
1. A_3 = Q_2^H A_2 Q_2 = (Q_1 Q_2)^H A_1 (Q_1 Q_2).
2. N ≜ (A_1 − µI)(A_1 − µ̄I) is real and N = Q_1 Q_2 R_2 R_1.
3. N_{ij} = 0 for i ≥ j + 3, i.e., N has two nonzero subdiagonals.

Since N is real, we can choose the Q-factor Q_1 Q_2 of its QR factorization to be real. This suggests that we can obtain the QR factorization of N to get Q_1 Q_2 and then use it to obtain A_3. This way avoids complex arithmetic. But there are two problems:
i. Forming N costs O(n^3): too expensive;
ii. Essentially this approach still involves explicit shifting.
We can avoid these two problems by the following algorithm.

One step of double QR iteration:
1. Compute n_1 = Ne_1 = [τ, σ, ν, 0, ..., 0]^T.
2. Apply a real Householder H_0 to n_1 such that H_0^T n_1 = ρe_1 (H_0 acts only on the first three components).
3. Apply H_0^T and H_0 to A_1 from the left and from the right, respectively, to form H_0^T A_1 H_0. This is upper Hessenberg apart from a small "bulge" in its top left corner.
4. Use real Householder transformations H_1, H_2, ..., H_{n−2} to transform H_0^T A_1 H_0 back to upper Hessenberg form, each H_j pushing the bulge one row down and to the right ("chasing the bulge"):

    Ã_3 = H_{n−2}^T ... H_1^T H_0^T A_1 H_0 H_1 ... H_{n−2}.

Ã_3 is real, upper Hessenberg, and was obtained by real arithmetic, by applying orthogonal transformations directly to the unshifted A_1. This process costs about 10n^2 flops if we do not form H_0 H_1 ... H_{n−2}; otherwise it costs about 10n^2 flops more.
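Step 1 needs only the trailing 2×2 block: with s ≜ µ + µ̄ and p ≜ µµ̄ (both real), N = A_1^2 − sA_1 + pI, and only three entries of Ne_1 are nonzero. A minimal MATLAB sketch (my illustration; the function name is mine):

    function n1 = double_shift_first_col(A)
    % First column of N = (A - mu*I)(A - conj(mu)*I) for upper Hessenberg A,
    % where mu, conj(mu) are the eigenvalues of the trailing 2x2 block; n >= 3.
    n = size(A,1);
    s = A(n-1,n-1) + A(n,n);                      % mu + conj(mu): trace
    p = A(n-1,n-1)*A(n,n) - A(n-1,n)*A(n,n-1);    % mu*conj(mu): determinant
    tau   = A(1,1)^2 + A(1,2)*A(2,1) - s*A(1,1) + p;
    sigma = A(2,1)*(A(1,1) + A(2,2) - s);
    nu    = A(2,1)*A(3,2);
    n1 = [tau; sigma; nu; zeros(n-3,1)];
    end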

Remember that what we want is A_3. But the Ã_3 given by the above algorithm is essentially just A_3. In order to understand this, we need the following results.

The Implicit Q-Theorem. Suppose Q and P are two real orthogonal matrices such that Q^T AQ and P^T AP are both upper Hessenberg matrices, one of which is unreduced (i.e., all of its subdiagonal entries are nonzero). If Qe_1 = ±Pe_1, then Qe_j = ±Pe_j for j = 2, ..., n and Q^T AQ = D(P^T AP)D with D = diag(±1, ..., ±1). (See Golub & Van Loan p346.)

Q. Show that (Q_1 Q_2)e_1 = (H_0 H_1 ... H_{n−2})e_1 = H_0 e_1.

Q. Suppose A_1 is unreduced Hessenberg and µ and µ̄ are not its eigenvalues. Show that A_3 is unreduced Hessenberg. Hint: check the computation procedure from A_1 to A_3.

From the above results, we can conclude that if A_1 is unreduced Hessenberg and µ and µ̄ are not eigenvalues of A_1, then Ã_3 = DA_3 D. In other words, Ã_3 is essentially the same as A_3.

Note. The double QR iteration can also be applied with two real shifts. The idea can be applied to single or multiple shifts.

THE OVERALL PROCESS FOR REAL UNSYMMETRIC MATRICES (Golub and Van Loan p359)

QR Algorithm: Given A ∈ R^{n×n} and a tolerance tol greater than the unit roundoff, this algorithm computes the real Schur decomposition Q^T AQ = R. If Q and R are desired, then R is stored in A. If only the eigenvalues are desired, then the diagonal blocks in R are stored in the corresponding positions in A.

Step 1. Compute the Hessenberg reduction A := Q^T AQ, where Q = H_1 ... H_{n−2}. If the final Q is desired, form Q := H_1 ... H_{n−2}.

Step 2. until q = n
    Set to zero all subdiagonal entries that satisfy |a_{i,i−1}| ≤ tol(|a_{ii}| + |a_{i−1,i−1}|).
    Find the largest non-negative q and the smallest non-negative p such that

        A = [A_{11}  A_{12}  A_{13}]  p
            [0       A_{22}  A_{23}]  n−p−q
            [0       0       A_{33}]  q

    where A_{33} is upper quasi-triangular and A_{22} is unreduced. (Note: either p or q may be zero.)
    if q < n
        Perform one step of double QR iteration on A_{22}: A_{22} := Z^T A_{22} Z
        if Q is desired, Q := Q diag(I_p, Z, I_q)
        A_{12} := A_{12} Z
        A_{23} := Z^T A_{23}

Step 3. Upper triangularize all 2-by-2 diagonal blocks in A that have real eigenvalues, and accumulate the transformations if necessary.

This algorithm requires 25n^3 flops if Q and R are computed. If only the eigenvalues are desired, then 10n^3 flops are necessary. These flop counts are very approximate, and are based on the empirical observation that on average only two steps of double QR iteration are required before the lower 1-by-1 or 2-by-2 block decouples.

SOME USES OF THE QR ALGORITHM

1. Finding eigenvalues and vectors of real symmetric matrices.
2. Finding eigenvalues and vectors of real unsymmetric matrices.
3. Finding eigenvalues and vectors of general matrices.
4. Finding eigenvalues and vectors of matrices with special structure.
5. Solving the generalized EVP Ax = λBx, x ≠ 0, so det(A − λB) = 0. NB: QAZ(Z^{−1}x) = λQBZ(Z^{−1}x): the QZ algorithm (GVL 7.7). Find unitary Q and Z so that Â ≜ QAZ and B̂ ≜ QBZ are upper triangular. Then det(A − λB) = 0 = det(Â − λB̂) when α̂_{ii} = λ_i β̂_{ii}, i = 1, ..., n.
6. Computing the SVD.
7. Computing the generalized SVD (possible, not recommended).
8. Computing angles between subspaces (possible, not recommended).
9. Computing principal components (via the SVD).
10. Computing canonical correlations (possible, not recommended).
11. Eigenvalue allocation: compute F so that A + BF has desired eigen properties, e.g. eigenvalues (pole placement).

INVERSE ITERATION FOR EIGENVECTORS

The QR algorithm can be used to compute eigenvectors, see Watkins 5.7. But we introduce a simpler method here.

Inverse Iteration: Given A ∈ C^{n×n}, let µ be an approximation to an eigenvalue λ_j such that 0 < |λ_j − µ| ≪ |λ_i − µ| (i ≠ j). This algorithm computes the eigenvector corresponding to λ_j.

    Choose q_0 with ‖q_0‖_2 = 1
    for k = 1, 2, ...
        Solve (A − µI)z_k = q_{k−1}
        q_k = z_k/‖z_k‖_2
        Stop if ‖(A − µI)q_k‖_2 ≤ cu‖A‖_2, where c is a constant of order unity
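A minimal MATLAB sketch (my illustration, not from the notes; a plain backslash solve is used here, whereas Note 1 below refines the LU solve):

    function q = inverse_iteration(A, mu, maxit)
    % Inverse iteration for the eigenvector belonging to the eigenvalue
    % of A closest to the shift mu.
    n = size(A,1); u = eps;
    q = randn(n,1); q = q/norm(q);
    for k = 1:maxit
        z = (A - mu*eye(n)) \ q;      % solve (A - mu*I) z_k = q_{k-1}
        q = z/norm(z);                % q_k = z_k / ||z_k||_2
        if norm((A - mu*eye(n))*q) <= 10*u*norm(A), break, end
    end
    end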

Note:
1. Use LU factorization with partial pivoting to solve (A − µI)z_k = q_{k−1}, and replace any pivot of magnitude < u‖A‖ by u‖A‖; this works even when A − µI is singular. We can show that the ill-conditioning of A − µI will not spoil the computation: the ill-conditioning may cause a big error in the computed z_k, but the error in the direction of z_k is small.
2. The stopping criterion forces µ and q_k to be an exact eigenpair for a nearby problem:

    (A + E_k)q_k = µq_k,

where E_k = −r_k q_k^T with r_k ≜ (A − µI)q_k.
3. When we use a known eigenvalue to find the corresponding eigenvector, usually only one iteration step is needed in practice. If one step does not give the desired result, start with a new initial vector.

Convergence Analysis: Assume A is nondefective, AX = X diag(λ_i). We can write q_0 = Xa = Σ_{i=1}^n a_i x_i, where we assume a_j ≠ 0. Thus

    (A − µI)^{−k} q_0 = Σ_{i=1}^n a_i (λ_i − µ)^{−k} x_i
                      = (λ_j − µ)^{−k} [ a_j x_j + Σ_{i≠j} a_i ((λ_j − µ)/(λ_i − µ))^k x_i ].

Since |λ_j − µ| ≪ |λ_i − µ| (i ≠ j),

    q_k = (A − µI)^{−k} q_0 / ‖(A − µI)^{−k} q_0‖_2 → ±x_j/‖x_j‖_2   as k → ∞,

i.e. q_k rapidly converges to the direction of the eigenvector x_j of A.

If the eigenvalues {λ_i} of A have been found by the QR algorithm, we can apply inverse iteration to A_1, where A_1 is the Hessenberg matrix obtained by the Hessenberg reduction of A, i.e. A_1 = Q_0^T AQ_0, to find the corresponding eigenvectors. Inverse iteration with A_1 is economical because solving (A_1 − µI)z_k = q_{k−1} costs O(n^2) flops. Suppose by inverse iteration we obtain y_i, the eigenvector of A_1 corresponding to the eigenvalue λ_i, i.e. A_1 y_i = λ_i y_i, ‖y_i‖_2 = 1; then x_i ≜ Q_0 y_i gives

    Ax_i = λ_i x_i,   ‖x_i‖_2 = 1,

i.e. x_i is a unit eigenvector of A corresponding to λ_i.