Lecture 4 Eigenvalue problems Weinan E 1,2 and Tiejun Li 2 1 Department of Mathematics, Princeton University, weinan@princeton.edu 2 School of Mathematical Sciences, Peking University, tieli@pku.edu.cn No.1 Science Building, 1575
Outline Review Power method QR method
Eigenvalue problem Eigenvalue problem Find λ and x such that Ax = λx, x 0. λ is called the eigenvalues of A which satisfies the eigenpolynomial det(λi A) = 0, x is called the eigenvector corresponds to λ. The are n complex eigenvalues according to Fundamental Theorem of Algebra.
Eigenvalue problem for symmetric matrix Theorem (For symmetric matrix) The eigenvalue problem for real symmetric matrix has the properties 1. The eigenvalues are real, i.e. λ i R, i = 1,..., n. 2. The multiplicity of a eigenvalue to the eigenpolynomial = the number of linearly independent eigenvectors corresponding to this eigenvalue. 3. The linearly independent eigenvectors are orthogonal each other. 4. A has the following spectral decomposition A = QΛQ T where Q = (x T 1,, x T n ), Λ = diag(λ 1,, λ n).
Variational form for symmetric matrix Theorem (Courant-Fisher Theorem) Suppose A is symmetric, and the eigenvalues λ 1 λ n, if we define the Rayleigh quotient as then we have, λ 1 = max R A (u), R A (u) = ut Au u T u λ n = min R A (u).
Jordan form for non-symmetric matrix Theorem (Jordan form) Suppose A C n n, if A has r different eigenvalues λ 1,..., λ r with multiplicity n 1,..., n r, then there exists nonsingular P such that A has the following decomposition A = P JP 1 where J = diag(j 1,..., J r), and λ k 1. J k = λ.. k... 1, k = 1,..., r λ k
Gershgorin s disks theorem Definition Suppose that n 2 and A C n n. The Gershgorin discs D i, i = 1, 2,..., n, of the matrix A are defined as the closed circular regions D i = {z C : z a ii R i} in the complex plane, where is the radius of D i. n R i = j=1, j i a ij Theorem (Gershgorin theorem) All eigenvalues of the matrix A lie in the region D = n i=1d i, where D i are the Gershgorin discs of A.
Review Power method QR method Gershgorin s disks theorem Geometrical interpretation of Gershgorin s disks theorem for 30 1 2 A = 4 15 4 1 0 3 3 15 30
Outline Review Power method QR method
Basic idea of power method First suppose A is diagonizable, i.e. A = P ΛP 1 and Λ = diag(λ 1,..., λ n). We will assume λ 1 > λ 2 λ n in the follows and assume x i are the eigenvectors corresponding to λ i. For any initial u 0 = α 1x 1 + + α nx n, where α k C. We have n n A k u 0 = α j A k x j = α j λ k j x j j=1 j=1 ( n = λ k ( λ j ) ) kxj 1 α 1 x 1 + α j λ j=2 1
Power method We have A k u 0 lim k λ k = α 1 x 1. 1 Though λ 1 and α 1 is not known, the direction of x 1 is enough! Power method 1. Set up initial u 0, k = 1; 2. Perform a power step y k = Au k 1 ; 3. Find the maximal component for the absolute value of µ k = y k ; 4. Normalize u k = 1 µ k y k and repeat. We will have u k x 1, µ k λ 1.
Power method: example Example 1: compute the eigenvalue with largest modulus for 30 2 3 13 A = 5 11 10 8 9 7 6 12 4 14 15 1 Example 2: compute the eigenvalue with largest modulus for the second order ODE example (n=30)
Power method Theorem (Convergence of power method) If the eigenvalues of A has the order λ 1 > λ 2 λ p (counting multiplicity), and the algebraic multiplicity of λ 1 is equal to the geometric multiplicity. Suppose the projection of u 0 to the eigenspace of λ 1 is not 0, then the iterating sequence is convergent u k x 1, µ k λ 1, and the convergence rate is decided by λ 2 λ 1.
Shifted power method Shifted power method: Since the convergence rate is decided by λ 2, if λ 2 λ 1 λ 1 1, the convergence will be slow. An idea to overcome this issue is to shift the eigenvalues, i.e. to apply power method to B = A µi (µ is suitably chosen) such that λ 2(B) λ2 µ = λ 1(B) λ 1 µ 1 the eigenvalue with largest modulus keeps invariant. Shifted Power method 1. Set up initial u 0, k = 1; 2. Perform a power step y k = (A µi)u k 1 ; 3. Find the maximal component for the absolute value of a k = y k ; 4. Normalize u k = 1 a k y k and repeat. 5. λ max(a) = λ max(a µi) + µ (under suitable shift). Example 1: Shifted power method µ =?
Inverse power method How to obtain the smallest eigenvalue of A? This is closely related to computing the ground state energy E 0 for Schrödinger operator in quantum mechanics: where ψ is the wave function. ( ) 2 2µ 2 + U(r) ψ = E 0ψ Inverse power method: applying power method to A 1. The inverse of the largest eigenvalue (modulus) of A 1 corresponds to the smallest eigenvalue of A. Just change the step y k = Au k 1 in power method into Ay k = u k 1 Compute the smallest eigenvalue of Example 2 (n=30).
Inverse power method Sometimes inverse power method is cooperated with shifting to obtain the eigenvalue and eigenvector corresponding to some λ if we already have an approximate λ λ, then the power step Notice since λ λ, we have (A λi)y k = u k 1 λ max(a λi) = The convergence will be very fast. 1 λ λ 1 Compute the eigenvalue closest to 0.000 for Example 2 (n=30).
Rayleigh quotient accelerating When do we need Rayleigh quotient accelerating? If A is symmetric and we already have an approximate eigenvector u 0, we want to refine this eigenvector and corresponding eigenvalue λ. Rayleigh quotient iteration: (Inverse power method + shift) 1. Choose initial u 0, k = 1; 2. Compute Rayleigh quotient µ k = R A (u k 1 ); 3. Solve equation for u k, (A µ k I)y k = u k 1 ; 4. Normalize u k = 1 y k y k and repeat. Remark on Rayleigh quotient iteration and inverse power method.
Outline Review Power method QR method
QR method Suppose A R n n, then QR method is to apply iterations as follows A m 1 = Q m R m A m = R mq m where Q m is a orthogonal matrix, R m is an upper triangular matrix. Finally R m will tend to λ 1. λ.. 2... We find all of the eigenvalues of A! λ n. How to find matrix Q and R efficiently to perform QR factorization?
Simplest example Vector ( ) 3 x = 4 Try to eliminate the second component of x to 0. Define y = Qx, ( ) ( 0.6 0.8 5 Q =, y = 0.8 0.6 0 ),
Givens transformation Suppose ( a ) x = b Define rotation matrix ( c s ) G = s c a where c = a = cos θ, s = b = sin θ. It s quite clear that G is 2 +b a 2 2 +b2 a orthogonal matrix. We have Gx = y = ( a2 + b 2 This rotation is called Givens transformation. 0 )
Givens transformation Geometrical interpretation of Givens transformation y y x x θ x
General Givens transformation Define Givens matrix 1 G(i, k; θ) =... c s.. s c... 1 i-th row k-th row where c = cos θ, s = sin θ. Geometrical interpretation: Rotation with θ angle in i k plane.
Properties of Givens transformation Suppose the vector x = (x 1,..., x n) and we want to eliminate x k to 0 with x i. Define c = x i x k, s = x 2 i + x 2 k x 2 i + x 2 k and y = G(i, k; θ)x, then we have y i = x 2 i + x2 k, y k = 0
Householder transformation Definition. Suppose w R n and w 2 = 1, define H R n n as H = I 2ww T. H is called a Householder transformation. Properties of Householder transformation 1. Symmetric H T = H; 2. Orthogonal H T H = I; 3. Reflection (Go on to the next page! :-) )
Householder transformation For any x R n, Hx = x 2(w T x)w which is the mirror image of x w.r.t. the plane perpendicular to w. Geometrical interpretation ω ω
Application of Householder transformation For arbitrary x R n, there exists w such that Hx = αe 1 where α = ± x 2. Taking is OK. w = Proof: Define β = x αe 1 2, then x αe1 x αe 1 2 Hx = x 2(w T x)w = x 2 β 2 (α2 αe T 1 x)(x αe 1) = 2 x 2α 2 2αe T 1 x (α2 αe T 1 x)(x αe 1) = x (x αe 1) = αe 1
Application of Householder transformation If define x = (x 2,..., x n) T, there exists H R (n 1) (n 1) such that Define H = H x = αe 1 ( 1 0 0 H Then we have the last n 2 entries of Hx will be 0. i.e. Hx = (x 1, x 2 2 +... + x2 n, 0,..., 0) )
Upper Hessenberg form and QR method Upper Hessenberg form Upper Hessenberg matrix A with entry a ij = 0, j i 2, i.e. with the following form a 11 a 12 a 1n a 21 a 22 a 2n......... a n 1,n a nn
Why upper Hessenberg form Why take upper Hessenberg form? It can be proved that if A m 1 is in upper Hessenberg form then A m 1 = Q m R m, A m = R mq m. A m will be in upper Hessenberg form, too. The computational effort for QR factorization of upper Hessenberg form will be small. Example 3: A QR-factorization step for matrix 3 1 4 2 4 3 0 3 5
QR method for upper Hessenberg form How to transform upper Hessenberg form into QR form? The approach is to apply Givens transformation to A column by column to eliminate the sub-diagonal entries. Suppose d 1 b 1 d 2 A =......... b n 1 Apply Givens transformation G(1, 2; θ 1), where cos θ 1 = d 1, d 2 1 +b 2 1 sin θ 1 = b 1, then we have d 2 1 +b 2 1 d 1 A = d n 0 d 2......... b n 1 d n
QR method for upper Hessenberg form Now A = d 1 0 d 2......... b n 1 d n Apply Givens transformation G(2, 3; θ 2), where cos θ 2 = sin θ 2 = b 2 d 2 2 +b 2 2. We would zero out the entry a 32. Applying this procedure successively, we obtain d 1 0 d 2 A = R =......... d 2 d 2 2 +b 2 2, 0 d n
Transformation to upper Hessenberg form How to transform a matrix into upper Hessenberg form? The approach is to apply Householder transformation to A column by column. a 11 a 12 a 1n A = a 21 a 22 a 2n a n1 a n2 a nn Suitably choose Householder matrix H 1 such that H 1 a 11 a 21 a 31. a n1 = a 11 a 21 0. 0
Transformation to upper Hessenberg form Now we have A 1 = H 1AH 1 = a 11 a 12 a 1n a 21 a 22 a 2n 0 a n2 a nn Suitably choose Householder matrix H 2 such that H 2 a 12 a 22 a 32 a 42. a n2 = a 12 a 22 a 32 0. 0
Transformation to upper Hessenberg form Apply the Householder transformation A 2 = H 2A 1H 2,... successively, we will have the upper Hessenberg form a 11 a 12 a 1n a 21 a 22 a 2n B =......... a n 1,n a nn B has the same eigenvalues as A because of similarity transformation.
Transformation to upper Hessenberg form Compute all the eigenvalues of Example 2 (second order ODE, n=5) with QR method.
Homework assignment 4 1. Compute all the eigenvalues of Example 2 (second order ODE, n=20) with QR method.