Chapter 7. Lanczos Methods. 7.1 The Lanczos Algorithm
In this chapter we develop the Lanczos method, a technique that is applicable to large, sparse, symmetric eigenproblems. The method involves tridiagonalizing the given matrix A. However, unlike the Householder approach, no intermediate (and full) submatrices are generated. Equally important, information about A's extremal eigenvalues tends to emerge long before the tridiagonalization is complete. This makes the Lanczos algorithm particularly useful in situations where a few of A's largest or smallest eigenvalues are desired.

7.1 The Lanczos Algorithm

Suppose A ∈ R^{n×n} is large, sparse and symmetric. There exists an orthogonal matrix Q which transforms A to a tridiagonal matrix T:

    Q^T A Q = T  (tridiagonal).                                        (7.1.1)

Remark 7.1.1
(a) Such a Q can be generated by Householder transformations or Givens rotations.
(b) For almost all A (i.e., whenever all eigenvalues are distinct) and almost any q_1 ∈ R^n with ||q_1||_2 = 1, there exists an orthogonal matrix Q with first column q_1 satisfying (7.1.1). Moreover, q_1 determines T uniquely up to the signs of the columns of Q (that is, we may multiply each column by -1).

For x ∈ R^n let

    K[x, A, m] = [x, Ax, A^2 x, ..., A^{m-1} x] ∈ R^{n×m}.             (7.1.2)

K[x, A, m] is called a Krylov matrix. Let

    K(x, A, m) = Range(K[x, A, m]) = span(x, Ax, ..., A^{m-1} x).      (7.1.3)

K(x, A, m) is called the Krylov subspace generated by K[x, A, m].

Remark 7.1.2 For each H ∈ C^{n×m} (or R^{n×m}, m ≤ n) with rank(H) = m, there exist Q ∈ C^{n×m} (or R^{n×m}) with Q*Q = I_m and an upper triangular R ∈ C^{m×m} (or R^{m×m}) such that

    H = QR.                                                            (7.1.4)
Q is uniquely determined if we require all r_ii > 0.

Theorem 7.1.1 Let A be symmetric (Hermitian), let 1 ≤ m ≤ n be given, and suppose dim K(x, A, m) = m. Then:
(a) If

    K[x, A, m] = Q_m R_m                                               (7.1.5)

is a QR factorization, then Q_m* A Q_m = T_m is an m×m tridiagonal matrix and satisfies

    A Q_m = Q_m T_m + r_m e_m^T,    Q_m* r_m = 0.                      (7.1.6)

(b) Let ||x||_2 = 1. If Q_m ∈ C^{n×m} has first column x, satisfies Q_m* Q_m = I_m and

    A Q_m = Q_m T_m + r_m e_m^T,

where T_m is tridiagonal, then

    K[x, A, m] = [x, Ax, ..., A^{m-1} x] = Q_m [e_1, T_m e_1, ..., T_m^{m-1} e_1]   (7.1.7)

is a QR factorization of K[x, A, m].

Proof: (a) Since

    A K(x, A, j) ⊂ K(x, A, j+1),   j < m,                              (7.1.8)

from (7.1.5) we have

    span(q_1, ..., q_i) = K(x, A, i),   i ≤ m.                         (7.1.9)

So we have

    q_{i+1} ⊥ K(x, A, i) ⊇ A K(x, A, i-1) = A(span(q_1, ..., q_{i-1}))

by (7.1.8). This implies

    q_{i+1}* A q_j = 0,   j = 1, ..., i-1,   i+1 ≤ m.

That is,

    (Q_m* A Q_m)_{ij} = (T_m)_{ij} = q_i* A q_j = 0   for i > j+1.

So T_m is upper Hessenberg and hence tridiagonal (since T_m is Hermitian). It remains to show (7.1.6). Since [x, Ax, ..., A^{m-1} x] = Q_m R_m and

    A K[x, A, m] = K[x, A, m] S + A^m x e_m^T,

where S is the m×m down-shift matrix with ones on the subdiagonal and zeros elsewhere,
we have

    A Q_m R_m = Q_m R_m S + Q_m Q_m* A^m x e_m^T + (I - Q_m Q_m*) A^m x e_m^T.

Then, since e_m^T R_m^{-1} = γ e_m^T with γ = (R_m^{-1})_{mm},

    A Q_m = Q_m [R_m S + Q_m* A^m x e_m^T] R_m^{-1} + (I - Q_m Q_m*) A^m x e_m^T R_m^{-1}
          = Q_m [R_m S R_m^{-1} + γ Q_m* A^m x e_m^T] + γ (I - Q_m Q_m*) A^m x e_m^T
          = Q_m H_m + r_m e_m^T,

where r_m = γ (I - Q_m Q_m*) A^m x, so that Q_m* r_m = 0, and H_m = R_m S R_m^{-1} + γ Q_m* A^m x e_m^T is an upper Hessenberg matrix. But Q_m* A Q_m = H_m is Hermitian, so H_m = T_m is tridiagonal.

(b) We check (7.1.7). The first columns coincide: x = Q_m e_1. Suppose the i-th columns are equal, i.e., A^{i-1} x = Q_m T_m^{i-1} e_1. Then

    A^i x = A Q_m T_m^{i-1} e_1 = (Q_m T_m + r_m e_m^T) T_m^{i-1} e_1 = Q_m T_m^i e_1 + r_m e_m^T T_m^{i-1} e_1.

But e_m^T T_m^{i-1} e_1 = 0 for i < m, since T_m is tridiagonal. Therefore A^i x = Q_m T_m^i e_1, so the (i+1)-th columns are equal. It is clear that [e_1, T_m e_1, ..., T_m^{m-1} e_1] is an upper triangular matrix.

Theorem 7.1.1' If x = q_1 with ||q_1||_2 = 1 satisfies rank(K[x, A, n]) = n (that is, {x, Ax, ..., A^{n-1} x} are linearly independent), then there exists a unitary matrix Q with first column q_1 such that Q* A Q = T is tridiagonal.

Proof: Existence follows from Theorem 7.1.1(a) with m = n: Q_n = Q is unitary and AQ = QT.
Uniqueness: let Q* A Q = T, Q'* A Q' = T' and Q e_1 = Q' e_1. From the QR factorizations

    K[q_1, A, n] = QR = Q'R'

it follows that Q' = QD and R' = D* R, where D = diag(ε_1, ..., ε_n) with |ε_i| = 1. Substituting Q' = QD gives

    (QD)* A (QD) = D* Q* A Q D = D* T D = tridiagonal.
So Q is unique up to multiplying its columns by factors ε with |ε| = 1.

In the following paragraphs we investigate the Lanczos algorithm for the real case, i.e., A ∈ R^{n×n}: how to find an orthogonal matrix Q = (q_1, ..., q_n) with Q^T Q = I_n such that Q^T A Q = T is tridiagonal, Q being almost uniquely determined. Let

    AQ = QT,                                                           (7.1.10)

where Q = [q_1, ..., q_n] and T is the tridiagonal matrix with diagonal α_1, ..., α_n and sub/superdiagonal β_1, ..., β_{n-1}. Comparing the j-th columns of (7.1.10) gives

    A q_j = β_{j-1} q_{j-1} + α_j q_j + β_j q_{j+1}                    (7.1.11)

for j = 1, ..., n, with β_0 = β_n = 0. Multiplying (7.1.11) by q_j^T we obtain

    q_j^T A q_j = α_j.                                                 (7.1.12)

Define r_j = (A - α_j I) q_j - β_{j-1} q_{j-1}. Then

    r_j = β_j q_{j+1}   with   β_j = ±||r_j||_2,                       (7.1.13)

and if β_j ≠ 0 then

    q_{j+1} = r_j / β_j.                                               (7.1.14)

So we can determine the unknowns α_j, β_j, q_j in the order

    q_1, α_1, r_1, β_1, q_2, α_2, r_2, β_2, q_3, ....

These formulas define the Lanczos iteration:

    j = 0, r_0 = q_1, β_0 = 1, q_0 = 0;
    Do while (β_j ≠ 0)
        q_{j+1} = r_j / β_j,  j := j + 1,
        α_j = q_j^T A q_j,
        r_j = (A - α_j I) q_j - β_{j-1} q_{j-1},
        β_j = ||r_j||_2.                                               (7.1.15)

There is no loss of generality in choosing the β_j to be positive. The q_j are called Lanczos vectors. With careful overwriting and use of the formula α_j = q_j^T (A q_j - β_{j-1} q_{j-1}), the whole process can be implemented with only a pair of n-vectors.
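As a concrete sketch of the iteration (7.1.15): the NumPy routine below (the name `lanczos` and the diagonal test matrix are choices of this demo, not from the text) runs m steps and returns Q_m together with the tridiagonal T_m. With a well-separated extremal eigenvalue, the largest Ritz value is already accurate long before the tridiagonalization is complete.

```python
import numpy as np

def lanczos(A, q1, m):
    """Plain Lanczos iteration (7.1.15), without reorthogonalization.

    Returns Q_m (columns are Lanczos vectors) and the tridiagonal T_m.
    """
    n = A.shape[0]
    Q = np.zeros((n, m))
    alpha = np.zeros(m)
    beta = np.zeros(m)                # beta[j] holds beta_{j+1} of the text
    q = q1 / np.linalg.norm(q1)
    q_prev = np.zeros(n)
    b_prev = 0.0
    for j in range(m):
        Q[:, j] = q
        r = A @ q - b_prev * q_prev
        alpha[j] = q @ r              # alpha_j = q_j^T (A q_j - beta_{j-1} q_{j-1})
        r -= alpha[j] * q
        beta[j] = np.linalg.norm(r)
        if beta[j] == 0.0:            # invariant subspace found (the welcome event)
            m = j + 1
            break
        q_prev, q, b_prev = q, r / beta[j], beta[j]
    T = np.diag(alpha[:m]) + np.diag(beta[:m-1], 1) + np.diag(beta[:m-1], -1)
    return Q[:, :m], T

# Illustrative test matrix: eigenvalues 1, ..., 299 and a well-separated 1000.
rng = np.random.default_rng(0)
A = np.diag(np.r_[np.arange(1.0, 300.0), 1000.0])
Q, T = lanczos(A, rng.standard_normal(300), 20)
theta = np.linalg.eigvalsh(T)
print(theta[-1])                      # extremal Ritz value after only 20 of 300 steps
```

Only the matrix-vector product `A @ q` touches A, so a sparse operator can be substituted without changing the iteration.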
Algorithm 7.1.1 (Lanczos Algorithm) Given a symmetric A ∈ R^{n×n} and w ∈ R^n having unit 2-norm, the following algorithm computes a j×j symmetric tridiagonal matrix T_j with the property that σ(T_j) ⊂ σ(A) upon termination. The diagonal and subdiagonal elements of T_j are stored in α_1, ..., α_j and β_1, ..., β_{j-1} respectively.

    v_i := 0 (i = 1, ..., n), β_0 := 1, j := 0;
    Do while (β_j ≠ 0)
        if (j ≠ 0), then
            for i = 1, ..., n,
                t := w_i,  w_i := v_i / β_j,  v_i := -β_j t
            end for
        end if
        v := Aw + v,  j := j + 1,  α_j := w^T v,  v := v - α_j w,  β_j := ||v||_2.

Remark 7.1.3
(a) If the sparsity of A is exploited and only kn flops are involved in each product Aw (k ≪ n), then each Lanczos step requires about (4 + k)n flops to execute.
(b) The iteration stops before complete tridiagonalization if q_1 is contained in a proper invariant subspace. From the iteration (7.1.15) we have

    A(q_1, ..., q_m) = (q_1, ..., q_m) T_m + (0, ..., 0, β_m q_{m+1}),

where the last term is r_m e_m^T. This implies that β_m = 0 if and only if r_m = 0, and in that case

    A(q_1, ..., q_m) = (q_1, ..., q_m) T_m.

That is, Range(q_1, ..., q_m) = Range(K[q_1, A, m]) is an invariant subspace of A, and the eigenvalues of T_m are eigenvalues of A.

Theorem 7.1.2 Let A be symmetric and let q_1 be a given vector with ||q_1||_2 = 1. The Lanczos iteration (7.1.15) runs until j = m, where m = rank K[q_1, A, n]. Moreover, for j = 1, ..., m we have

    A Q_j = Q_j T_j + r_j e_j^T                                        (7.1.16)

where T_j is the j×j tridiagonal matrix with diagonal α_1, ..., α_j and sub/superdiagonal β_1, ..., β_{j-1}, and Q_j = [q_1, ..., q_j]
has orthonormal columns satisfying Range(Q_j) = K(q_1, A, j).

Proof: By induction on j. Suppose the iteration has produced Q_j = [q_1, ..., q_j] such that Range(Q_j) = K(q_1, A, j) and Q_j^T Q_j = I_j. It is easy to see from (7.1.15) that (7.1.16) holds. Thus

    Q_j^T A Q_j = T_j + Q_j^T r_j e_j^T.

Since α_i = q_i^T A q_i for i = 1, ..., j and

    q_{i+1}^T A q_i = q_{i+1}^T (β_i q_{i+1} + α_i q_i + β_{i-1} q_{i-1}) = q_{i+1}^T (β_i q_{i+1}) = β_i

for i = 1, ..., j-1, we have Q_j^T A Q_j = T_j. Consequently Q_j^T r_j = 0. If r_j ≠ 0, then q_{j+1} = r_j / ||r_j||_2 is orthogonal to q_1, ..., q_j and

    q_{j+1} ∈ span{A q_j, q_j, q_{j-1}} ⊂ K(q_1, A, j+1).

Thus Q_{j+1}^T Q_{j+1} = I_{j+1} and Range(Q_{j+1}) = K(q_1, A, j+1). On the other hand, if r_j = 0, then A Q_j = Q_j T_j. This says that Range(Q_j) = K(q_1, A, j) is invariant. From this we conclude that j = m = dim K(q_1, A, n).

Encountering a zero β_j in the Lanczos iteration is a welcome event in that it signals the computation of an exact invariant subspace. However, an exactly zero or even small β_j is rare in practice. Consequently, other explanations for the convergence of T_j's eigenvalues must be sought.

Theorem 7.1.3 Suppose that j steps of the Lanczos algorithm have been performed and that

    S_j^T T_j S_j = diag(θ_1, ..., θ_j)

is the Schur decomposition of the tridiagonal matrix T_j. If Y_j ∈ R^{n×j} is defined by

    Y_j = [y_1, ..., y_j] = Q_j S_j,

then for i = 1, ..., j we have

    ||A y_i - θ_i y_i||_2 = |β_j| |s_{ji}|,

where S_j = (s_pq).

Proof: Post-multiplying (7.1.16) by S_j gives

    A Y_j = Y_j diag(θ_1, ..., θ_j) + r_j e_j^T S_j,

i.e.,

    A y_i = θ_i y_i + r_j (e_j^T S_j e_i),   i = 1, ..., j.

The proof is complete by taking norms and recalling ||r_j||_2 = |β_j|.

Remark 7.1.4 The theorem provides computable error bounds for T_j's eigenvalues:

    min_{μ ∈ σ(A)} |θ_i - μ| ≤ |β_j| |s_{ji}|,   i = 1, ..., j.
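The residual identity of Theorem 7.1.3 is easy to check numerically: after j Lanczos steps, the Ritz residual norms ||A y_i - θ_i y_i||_2 equal β_j |s_{ji}| up to roundoff. A minimal self-contained sketch, with an arbitrary random symmetric test matrix (an assumption of this demo):

```python
import numpy as np

rng = np.random.default_rng(1)
n, j = 12, 6
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                      # symmetric test matrix

# j steps of the Lanczos iteration (7.1.15)
Q = np.zeros((n, j))
alpha, beta = np.zeros(j), np.zeros(j)
q = rng.standard_normal(n)
q /= np.linalg.norm(q)
q_prev, b_prev = np.zeros(n), 0.0
for k in range(j):
    Q[:, k] = q
    r = A @ q - b_prev * q_prev
    alpha[k] = q @ r
    r -= alpha[k] * q
    beta[k] = np.linalg.norm(r)
    q_prev, q, b_prev = q, r / beta[k], beta[k]
T = np.diag(alpha) + np.diag(beta[:-1], 1) + np.diag(beta[:-1], -1)

theta, S = np.linalg.eigh(T)           # T_j S = S diag(theta_1, ..., theta_j)
Y = Q @ S                              # Ritz vectors y_i = Q_j s_i
res = np.linalg.norm(A @ Y - Y * theta, axis=0)   # ||A y_i - theta_i y_i||_2
bound = beta[-1] * np.abs(S[-1, :])               # beta_j |s_{ji}|
print(res)
print(bound)
```

So the bottom row of S_j, which is available at no extra cost, tells how good each Ritz pair is without ever forming A y_i.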
7 71 The Lanczos Algorithm 267 Note that in section 10 the (θ i, y i ) are Ritz pairs for the subspace R(Q j ) If we use the Lanczos method to compute AQ j = Q j T j + r j e T j and set E = τww T where τ = ±1 and w = aq j + br j, then it can be shown that (A + E)Q j = Q j (T j + τa 2 e j e T j ) + (1 + τab)r j e T j If 0 = 1 + τab, then the eigenvalues of the tridiagonal matrix T j = T j + τa 2 e j e T j are also eigenvalues of A+E We may then conclude from theorem 612 that the interval [λ i (T j ), λ i 1 (T j )] where i = 2,, j, each contains an eigenvalue of A + E Suppose we have an approximate eigenvalue λ of A One possibility is to choose τa 2 so that det( T j λi j ) = (α j + τa 2 λ)p j 1 ( λ) β 2 j 1p j 2 ( λ) = 0, where the polynomial p i (x) = det(t i xi i ) can be evaluated at λ using (53) The following theorems are known as the Kaniel-Paige theory for the estimation of eigenvalues which obtained via the Lanczos algorithm Theorem 714 Let A be n n symmetric matrix with eigenvalues λ 1 λ n and corresponding orthonormal eigenvectors z 1,, z n If θ 1 θ j are the eigenvalues of T j obtained after j steps of the Lanczos iteration, then λ 1 θ 1 λ 1 (λ 1 λ n ) tan (φ 1 ) 2 [c j 1 (1 + 2ρ 1 )] 2, where cos φ 1 = q T 1 z 1, ρ 1 = (λ 1 λ 2 )/(λ 2 λ n ) and c j 1 is the Chebychev polynomial of degree j 1 Proof: From Courant-Fischer theorem we have θ 1 = max y 0 y T T j y y T y = max y 0 (Q j y) T A(Q j y) (Q j y) T (Q j y) = max w T Aw 0 w K(q 1,A,j) w T w Since λ 1 is the maximum of w T Aw/w T w over all nonzero w, it follows that λ 1 θ 1 To obtain the lower bound for θ 1, note that θ 1 = max p P j 1 q T 1 p(a)ap(a)q 1 q T 1 p(a) 2 q 1, where P j 1 is the set of all j 1 degree polynomials If q 1 = n d i z i then q T 1 p(a)ap(a)q 1 q T 1 p(a) 2 q 1 = n d2 i p(λ i ) 2 λ i n d2 i p(λ i) 2
8 268 Chapter 7 Lanczos Methods n i=2 λ 1 (λ 1 λ n ) d2 i p(λ i ) 2 d 2 1p(λ 1 ) 2 + n i=2 d2 i p(λ i) 2 We can make the lower bound tight by selecting a polynomial p(x) that is large at x = λ 1 in comparison to its value at the remaining eigenvalues Set p(x) = c j 1 [ x λ n λ 2 λ n ], where c j 1 (z) is the (j 1)-th Chebychev polynomial generated by c j (z) = 2zc j 1 (z) c j 2 (z), c 0 = 1, c 1 = z These polynomials are bounded by unity on [-1,1] It follows that p(λ i ) is bounded by unity for i = 2,, n while p(λ 1 ) = c j 1 (1 + 2ρ 1 ) Thus, θ 1 λ 1 (λ 1 λ n ) (1 d2 1) d c 2 j 1 (1 + 2ρ 1) The desired lower bound is obtained by noting that tan (φ 1 ) 2 = (1 d 2 1)/d 2 1 Corollary 715 Using the same notation as Theorem 714 λ n θ j λ n + (λ 1 λ n ) tan 2 (φ n ) c 2 j 1 (1 + 2ρ, n) where ρ n = (λ n 1 λ n )/(λ 1 λ n 1 ) and cos (φ n ) = q T 1 z n Proof: Apply Theorem 714 with A replaced by A Example 711 L j 1 R j 1 = ( λ 2 λ 1 ) 2(j 1) 1 [C j 1 (2 λ 1 λ 2 1)] 1 2 [C j 1 (1 + 2ρ 1 )] 2 power method λ 1 /λ 2 j=5 j= / / L j 1 /R j / / L j 1 /R j 1 Rounding errors greatly affect the behavior of algorithm 711, the Lanczos iteration The basic difficulty is caused by loss of orthogonality among the Lanczos vectors To avoid these difficulties we can reorthogonalize the Lanczos vectors
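When the spectrum is known, the Kaniel-Paige bound of Theorem 7.1.4 can be evaluated directly. The sketch below (the diagonal test matrix and sizes are assumptions of this demo) computes λ_1 - θ_1 after j steps and compares it against (λ_1 - λ_n) tan²(φ_1) / c_{j-1}(1 + 2ρ_1)²:

```python
import numpy as np

rng = np.random.default_rng(2)
n, j = 50, 8
lam = np.arange(1.0, n + 1)            # spectrum of A = diag(1, ..., 50)
A = np.diag(lam)
l1, l2, ln = lam[-1], lam[-2], lam[0]  # lambda_1 > lambda_2 > ... > lambda_n

q1 = rng.standard_normal(n)
q1 /= np.linalg.norm(q1)

# j steps of the Lanczos iteration starting from q1
alpha, beta = np.zeros(j), np.zeros(j - 1)
q, q_prev, b_prev = q1.copy(), np.zeros(n), 0.0
for k in range(j):
    r = A @ q - b_prev * q_prev
    alpha[k] = q @ r
    r -= alpha[k] * q
    if k < j - 1:
        beta[k] = np.linalg.norm(r)
        q_prev, q, b_prev = q, r / beta[k], beta[k]
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
theta1 = np.linalg.eigvalsh(T)[-1]     # largest Ritz value theta_1

# Kaniel-Paige: l1 - theta1 <= (l1 - ln) tan^2(phi_1) / c_{j-1}(1 + 2 rho_1)^2
cos_phi1 = abs(q1[-1])                 # q1^T z1; z1 = e_n since lambda_1 = lam[-1]
tan2_phi1 = (1.0 - cos_phi1**2) / cos_phi1**2
rho1 = (l1 - l2) / (l2 - ln)
cheb = np.cosh((j - 1) * np.arccosh(1.0 + 2.0 * rho1))   # c_{j-1}(1 + 2 rho_1)
bound = (l1 - ln) * tan2_phi1 / cheb**2
print(l1 - theta1, bound)
```

For x > 1 the Chebyshev value c_{j-1}(x) = cosh((j-1) arccosh x) grows rapidly with j, which is what makes the bound shrink so quickly with the number of Lanczos steps.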
9 71 The Lanczos Algorithm Reorthogonalization Since let AQ j = Q j T j + r j e T j, AQ j Q j T j = r j e T j + F j (7117) I Q T j Q j = C T j + j + C j, (7118) where C j is strictly upper triangular and j is diagonal (For simplicity, suppose (C j ) i,i+1 = 0 and i = 0) Definition 711 θ i and y i Q j s i are called Ritz value and Ritz vector, respectively, if T j s i = θ i s i Let Θ j diag(θ 1,, θ j ) = S T j T j S j where S j = [ s 1 s j ] Theorem 716 (Paige Theorem) Assume that (a) S j and Θ j are exact! (Since j n) (b) local orthogonality is maintained ( ie q T i+1q i = 0, i = 1,, j 1, r T j q j = 0, and (C j ) i,i+1 = 0 ) Let Then F T j Q j Q T j F j = K j K T j, j T j T j j N j N T j, G j = S T j (K j + N j )S j (r ik ) (a) y T i q j+1 = r ii /β ji, where y i = Q j s i, β ji = β j s ji (b) For i k, (θ i θ k )y T i y k = r ii ( s jk s ji ) r kk ( s ji s jk ) (r ik r ki ) (7119) Proof: Multiplied (7117) from left by Q T j, we get which implies that Subtracted (7120) from (7121), we have Q T j AQ j Q T j Q j T j = Q T j r j e T j + Q T j F j, (7120) Q T j A T Q j T j Q T j Q j = e j r T j Q j + F T j Q j (7121) (Q T j γ j )e T j e j (Q T j γ j ) T = (C T j T j T j C T j ) + (C j T j T j C j ) + ( j T j T j j ) + F T j Q j Q j F T j = (C T j T j T j C T j ) + (C j T j T j C j ) + (N j N T j ) + (K j K T j )
10 270 Chapter 7 Lanczos Methods This implies that Thus, (Q T j r j )e T j = C j T j T j C j + N j + K j which implies that y T i q j+1 β ji = s T i (Q T j r j )e T j s i = s T i (C j T j T j C j )s i + s T i (N j + K j )s i = (s T i C j s i )θ i θ i (s T i C j s i ) + r ii, y T i q j+1 = r ii β ji Similarly, (7119) can be obtained by multiplying (7120) from left and right by s T i s i, respectively and Remark 715 Since y T i q j+1 = r ii β ji = { O(esp), if βji = O(1), (not converge!) O(1), if β ji = O(esp), (converge for (θ j, y j )) we have that q T j+1y i = O(1) when the algorithm converges, ie, q j+1 is not orthogonal to < Q j > where Q j s i = y i (i) Full Reorthogonalization by MGS: Orthogonalize q j+1 to all q 1,, q j by q j+1 := q j+1 j (qj+1q T i )q i If we incorporate the Householder computations into the Lanczos process, we can produce Lanczos vectors that are orthogonal to working accuracy: r 0 := q 1 (given unit vector) Determine P 0 = I 2v 0 v T 0 /v T 0 v 0 so that P 0 r 0 = e 1 ; α 1 := q T 1 Aq 1 ; Do j = 1,, n 1, r j := (A α j )q j β j 1 q j 1 (β 0 q 0 0), w := (P j 1 P 0 )r j, Determine P j = I 2v j v T j /v T j v j such that P j w = (w 1,, w j, β j, 0,, 0) T, q j+1 := (P 0 P j )e j+1, α j+1 := q T j+1aq j+1 This is the complete reorthogonalization Lanczos scheme
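The complete-reorthogonalization idea in (i) can also be realized with modified Gram-Schmidt instead of Householder reflections. A sketch follows (the routine name, the two-pass MGS sweep, and the test matrix are choices of this demo, not of the text); the point is that ||I - Q_j^T Q_j|| stays at working accuracy even after many steps on a matrix whose rapidly converging Ritz pairs would otherwise destroy orthogonality:

```python
import numpy as np

def lanczos_full_reorth(A, q1, m):
    """Lanczos with full reorthogonalization: each residual is re-orthogonalized
    against all previous Lanczos vectors (MGS, two passes) before normalization."""
    n = A.shape[0]
    Q = np.zeros((n, m))
    alpha, beta = np.zeros(m), np.zeros(m - 1)
    Q[:, 0] = q1 / np.linalg.norm(q1)
    for k in range(m):
        r = A @ Q[:, k]
        alpha[k] = Q[:, k] @ r
        r -= alpha[k] * Q[:, k]
        if k > 0:
            r -= beta[k - 1] * Q[:, k - 1]
        for _ in range(2):                 # "twice is enough" MGS sweep
            for i in range(k + 1):
                r -= (Q[:, i] @ r) * Q[:, i]
        if k < m - 1:
            beta[k] = np.linalg.norm(r)
            Q[:, k + 1] = r / beta[k]
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    return Q, T

rng = np.random.default_rng(3)
n, m = 400, 60
A = np.diag(np.arange(1.0, n + 1) ** 2)    # wide spectrum: fast Ritz convergence,
q1 = rng.standard_normal(n)                # which is what causes orthogonality loss
Q, T = lanczos_full_reorth(A, q1, m)
ortho = np.linalg.norm(np.eye(m) - Q.T @ Q)
print(ortho)                                # remains at working accuracy
```

The price is O(jn) extra work per step and storage of all Lanczos vectors, which is exactly why the selective and partial variants below exist.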
11 71 The Lanczos Algorithm 271 (ii) Selective Reorthogonalization by MGS If β ji = O( eps), (θ j, y j ) good Ritz pair Do q j+1 q 1,, q j Else not to do Reorthogonalization (iii) Restart after m-steps (Do full Reorthogonalization) (iv) Partial Reorthogonalization Do reorthogonalization with previous (eg k = 5) Lanczos vectors {q 1,, q k } For details see the books: Parlett: Symmetric Eigenvalue problem (1980) pp257 Golub & Van Loan: Matrix computation (1981) pp332 To (7119): The duplicate pairs can occur! i k, (θ i θ k ) yi T y }{{} k = O(esp) O(1), if y i = y k Q i Q k How to avoid the duplicate pairs? The answer is using the implicit Restart Lanczos algorithm: Let be a Lanczos decomposition AQ j = Q j T j + r j e T j In principle, we can keep expanding the Lanczos decomposition until the Ritz pairs have converged Unfortunately, it is limited by the amount of memory to storage of Q j Restarted the Lanczos process once j becomes so large that we cannot store Q j Implicitly restarting method Choose a new starting vector for the underlying Krylov sequence A natural choice would be a linear combination of Ritz vectors that we are interested in 712 Filter polynomials Assume A has a complete system of eigenpairs (λ i, x i ) and we are interested in the first k of these eigenpairs Expand u 1 in the form k n u 1 = γ i x i + γ i x i If p is any polynomial, we have p(a)u 1 = k γ i p(λ i )x i + i=k+1 n γ i p(λ i )x i i=k+1
12 272 Chapter 7 Lanczos Methods Choose p so that the values p(λ i ) (i = k+1,, n) are small compared to the values p(λ i ) (i = 1,, k) Then p(a)u 1 is rich in the components of the x i that we want and deficient in the ones that we do not want p is called a filter polynomial Suppose we have Ritz values µ 1,, µ m and µ k+1,, µ m are not interesting Then take p(t) = (t µ k+1 ) (t µ m ) 713 Implicitly restarted algorithm Let AQ m = Q m T m + β m q m+1 e T m (7122) be a Lanczos decomposition with order m Choose a filter polynomial p of degree m k and use the implicit restarting process to reduce the decomposition to a decomposition A Q k = Q k Tk + β k q k+1 e T k of order k with starting vector p(a)u 1 Let ν 1,, ν m be eigenvalues of T m and suppose that ν 1,, ν m k correspond to the part of the spectrum we are not interested in Then take The starting vector p(a)u 1 is equal to p(t) = (t ν 1 )(t ν 2 ) (t ν m k ) p(a)u 1 = (A ν m k I) (A ν 2 I)(A ν 1 I)u 1 = (A ν m k I) [ [(A ν 2 I) [(A ν 1 I)u 1 ]]] In the first, we construct a Lanczos decomposition with starting vector (A ν 1 I)u 1 From (7122), we have where (A ν 1 I)Q m = Q m (T m ν 1 I) + β m q m+1 e T m (7123) = Q m U 1 R 1 + β m q m+1 e T m, T m ν 1 I = U 1 R 1 is the QR factorization of T m κ 1 I Postmultiplying by U 1, we get It implies that where (Q (1) m (A ν 1 I)(Q m U 1 ) = (Q m U 1 )(R 1 U 1 ) + β m q m+1 (e T mu 1 ) AQ (1) m = Q (1) m T m (1) + β m q m+1 b (1)T m+1, Q (1) m = Q m U 1, T (1) m = R 1 U 1 + ν 1 I, b (1)T m+1 = e T mu 1 : one step of single shifted QR algorithm)
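One single-shift QR step of the kind used above is easy to verify numerically: for a tridiagonal T_m and shift ν, factor T_m - νI = U_1 R_1 and form T_m^(1) = R_1 U_1 + νI. The sketch below (a random unreduced tridiagonal matrix is an assumption of the demo) checks that T_m^(1) is similar to T_m, is again tridiagonal up to roundoff, and that e_m^T U_1 has only its last two components nonzero:

```python
import numpy as np

rng = np.random.default_rng(4)
m = 8
a = rng.standard_normal(m)
b = np.abs(rng.standard_normal(m - 1)) + 0.1      # beta_k != 0: T is unreduced
T = np.diag(a) + np.diag(b, 1) + np.diag(b, -1)

nu = np.linalg.eigvalsh(T)[0]      # shift at an (unwanted) Ritz value
U, R = np.linalg.qr(T - nu * np.eye(m))
T1 = R @ U + nu * np.eye(m)        # one shifted QR step: T1 = U^T T U

# similarity: the spectrum is unchanged
ev_gap = np.linalg.norm(np.linalg.eigvalsh(T1) - np.linalg.eigvalsh(T))
# T1 is again (numerically) symmetric tridiagonal
off = np.linalg.norm(np.tril(T1, -2)) + np.linalg.norm(np.triu(T1, 2))
# b_{m+1}^T = e_m^T U_1: only the last two components are nonzero,
# because U_1 inherits the upper Hessenberg form of T - nu I
last_row = U[-1]
print(ev_gap, off, np.abs(last_row[:-2]).max())
```

Because the shift is an eigenvalue of T_m, the last subdiagonal entry of T_m^(1) is driven to roundoff level, which is the deflation the implicit restart exploits.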
13 71 The Lanczos Algorithm 273 Remark 716 Q (1) m is orthonormal By the definition of T (1) m, we get U 1 T (1) m U T 1 = U 1 (R 1 U 1 + ν 1 I)U T 1 = U 1 R 1 + ν 1 I = T m (7124) Therefore, ν 1, ν 2,, ν m are also eigenvalues of T (1) m Since T m is tridiagonal and U 1 is the Q-factor of the QR factorization of T m ν 1 I, it implies that U 1 and T m (1) are upper Hessenberg From (7124), T m (1) is symmetric Therefore, T m (1) is also tridiagonal The vector b (1)T m+1 = e T mu 1 has the form [ b (1)T m+1 = 0 0 U (1) m 1,m U m,m (1) ie, only the last two components of b (1) m+1 are nonzero For on postmultiplying (7123) by e 1, we get (A ν 1 I)q 1 = (A ν 1 I)(Q m e 1 ) = Q (1) m R 1 e 1 = r (1) 11 q (1) 1 ] ; Since T m is unreduced, r (1) 11 is nonzero Therefore, the first column of Q (1) m multiple of (A κ 1 I)q 1 is a Repeating this process with ν 2,, ν m k, the result will be a Krylov decomposition AQ (m k) m with the following properties = Q (m k) m T m (m k) + β m q m+1 b (m k)t m+1 i Q (m k) m ii T (m k) m is orthonormal is tridiagonal iii The first k 1 components of b (m k)t m+1 are zero iv The first column of Q (m k) m is a multiple of (A ν 1 I) (A ν m k I)q 1 Corollary 711 Let ν 1,, ν m be eigenvalues of T m If the implicitly restarted QR step is performed with shifts ν 1,, ν m k, then the matrix T m (m k) has the form [ ] T (m k) m = T (m k) kk T (m k) k,m k 0 T (m k) k+1,k+1 where T (m k) k+1,k+1 is an upper triangular matrix with Ritz value ν 1,, ν m k on its diagonal,
14 274 Chapter 7 Lanczos Methods Therefore, the first k columns of the decomposition can be written in the form where Q (m k) k AQ (m k) k = Q (m k) k T (m k) kk + t k+1,k q (m k) k+1 e T k + β k u mk q m+1 e T k, consists of the first k columns of Q (m k) m submatrix of order k of T m (m k) we set then, T (m k) kk is the leading principal, and u km is from the matrix U = U 1 U m k Hence if Q k = Q (m k) k, T k = T (m k) kk, β k = t k+1,k q (m k) k+1 + β k u mk q m+1 2, 1 q k+1 = β k (t k+1,kq (m k) k+1 + β k u mk q m+1 ), A Q k = Q k Tk + β k q k+1 e T k is a Lanczos decomposition whose starting vector is proportional to (A ν 1 I) (A ν m k I)q 1 Avoid any matrix-vector multiplications in forming the new starting vector Get its Lanczos decomposition of order k for free For large n the major cost will be in computing QU 72 Approximation from a subspace Assume that A is symmetric and {(α i, z i )} n be eigenpairs of A with α 1 α 2 α n Define ρ(x) = ρ(x, A) = xt Ax x T x Algorithm 721 (Rayleigh-Ritz-Quotient procedure) Give a subspace S (m) = span{q} with Q T Q = I m ; Set H := ρ(q) = Q T AQ; Compute the p ( m) eigenpairs of H, which are of interest, say Hg i = θ i g i for i = 1,, p; Compute Ritz vectors y i = Qg i, for i = 1,, p; Check Ay i θ i y i 2 T ol, for i = 1,, p By the minimax characterization of eigenvalues, we have α j = λ j (A) = min max ρ(f, A) F j R n f F j
15 72 Approximation from a subspace 275 Define β j = min max ρ(g, A), for j m G j S m g Gj Since G j S m and S (m) = span{q}, it implies that G j = Q G j for some G j R m Therefore, β j = min G j R m max s G j ρ(s, H) = λ j (H) θ j, for j = 1,, m For any m by m matrix B, there is associated a residual matrix R(B) AQ QB Theorem 721 For given orthonormal n by m matrix Q, for all m by m matrix B Proof: Since R(H) R(B) R(B) R(B) = Q A 2 Q B (Q AQ) (Q AQ)B + B B = Q A 2 Q H 2 + (H B) (H B) = R(H) R(H) + (H B) (H B) and (H B) (H B) is positive semidefinite, it implies that R(B) 2 R(H) 2 Since we have that which implies that Hg i = θ i g i, for i = 1,, m, Q T AQg i = θ i g i, QQ T A(Qg i ) = θ i (Qg i ) Let y i = Qg i Then QQ T y i = Q(Q T Q)g i = y i Take P Q = QQ T which is a projection on span{q} Then which implies that ie, r i = Ay i θ i y i S m = span{q} (QQ T )Ay i = θ i (QQ T )y i, P Q (Ay i θ i y i ) = 0,
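Algorithm 7.2.1 and Theorem 7.2.1 can be illustrated together. The sketch below (the random A and random subspace are assumptions of the demo) forms H = Q^T A Q, checks that the Ritz residuals A y_i - θ_i y_i are orthogonal to span{Q}, and checks that B = H minimizes ||AQ - QB||_F:

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 30, 5
M = rng.standard_normal((n, n))
A = (M + M.T) / 2

Q, _ = np.linalg.qr(rng.standard_normal((n, m)))   # orthonormal basis of S^(m)
H = Q.T @ A @ Q                                    # rho(Q) in the text
theta, G = np.linalg.eigh(H)                       # H g_i = theta_i g_i
Y = Q @ G                                          # Ritz vectors y_i = Q g_i

# Ritz residuals are orthogonal to the subspace: P_Q (A y_i - theta_i y_i) = 0
R_ritz = A @ Y - Y * theta
proj = np.linalg.norm(Q.T @ R_ritz)

# Theorem 7.2.1: H minimizes ||AQ - QB||_F over all m x m matrices B, and
# ||AQ - QB||_F^2 = ||AQ - QH||_F^2 + ||H - B||_F^2 for any B
res_H = np.linalg.norm(A @ Q - Q @ H)
B = H + 0.1 * rng.standard_normal((m, m))          # any perturbation does worse
res_B = np.linalg.norm(A @ Q - Q @ B)
print(proj, res_H, res_B)
```

The Pythagorean identity in the comment is exactly the argument in the proof of Theorem 7.2.1, since AQ - QH has columns orthogonal to Range(Q).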
16 276 Chapter 7 Lanczos Methods Theorem 721 Let θ j for j = 1,, m be eigenvalues of H = Q T AQ Then there is α j σ(a) such that Furthermore, θ j α j R 2 = AQ QH 2, for j = 1,, m m (θ j α j ) 2 2 R 2 F j=1 Proof: See the detail in Chapter 11 of The symmetric eigenvalue problem, Parlett(1981) Theorem 722 Let y be a unit vector θ = ρ(y), α be an eigenvalue of A closest to θ and z be its normalized eigenvector Let r = min λi α λ i (A) θ and ψ = (y, z) Then where r(y) = Ay θy θ α r(y) 2 /r, sin ψ r(y) /r, Proof: Decompose y = z cos ψ + w sin ψ with z T w = 0 and w 2 = 1 Hence r(y) = z(α θ) cos ψ + (A θ)w sin ψ Since Az = αz and z T w = 0, we have z T (A θ)w = 0 and so r(y) 2 2 = (α θ) 2 cos 2 ψ + (A θ)w 2 2 sin 2 ψ (A θ)w 2 2 sin 2 ψ (7225) Let w = α i α ξ iz i Then (A θ)w 2 2 = w T (A θ)(a θ)w = α i α (α i θ) 2 ξ 2 i r 2 ( α i α ξ 2 i ) = r 2 Therefore, Since r(y) y, we have sin ψ r(y) 2 r which implies that From above equation, we get 0 = y T r(y) = (α θ) cos 2 ψ + w T (A θ)w sin 2 ψ, sin 2 ψ = 1 k + 1 = k cos2 ψ sin 2 ψ = wt (A θ)w θ α α θ w T (A α)w, cos2 ψ = k k + 1 = wt (A θ)w w T (A α)w (7226)
17 72 Approximation from a subspace 277 Substituting (7226) into (7225), r(y) 2 2 can be rewritten as r(y) 2 2 = (θ α)w T (A α)(a θ)w/w T (A α)w (7227) By assumption there are no eigenvalues of A separating α and θ Thus (A αi)(a θi) is positive definite and so w T (A α)(a θ)w = α i α α i α α i θ ξ 2 i r α i α α i α ξ 2 i r α i α(α i α)ξ 2 i = r w T (A α)w (7228) Substituting (7228) into (7227), the theorem s first inequality appears 100 years old and still alive : Eigenvalue problems Hank / G Gloub / Van der Vorst / A priori bounds for interior Ritz approximations Given subspace S m = span{q}, let {(θ i, y i )} m be Ritz pairs of H = Q T AQ and Az i = α i z i, i = 1,, n Lemma 723 For each j m for any unit s S m satisfying s T z i = 0, i = 1,, j 1 Then where ψ i = (y i, z i ) j 1 α j θ j ρ(s) + (α 1 θ i ) sin 2 ψ i (7229) j 1 ρ(s) + (α 1 α i ) sin 2 ψ i, Proof: Take j 1 s = t + r i y i, where t y i for i = 1,, j 1 and s 2 = 1 Assumption s T z i = 0 for i = 1,, j 1 and lead to y i z i cos ψ i 2 2 = (y i z i cos ψ i ) T (y i z i cos ψ i ) = 1 cos 2 ψ i cos 2 ψ i + cos 2 ψ i = 1 cos 2 ψ i = sin 2 ψ i r i = s T y i = s T (y i z i cos ψ) s 2 sin ψ
18 278 Chapter 7 Lanczos Methods Let (θ i, g i ) for i = 1,, m be eigenpairs of symmetric H with g T i g k = 0 for i k and y i = Qg i Then Combining (7230) with t T Ay i = 0, we get Thus 0 = g T i (Q T AQ)g k = y T i Ay k for i k (7230) j 1 ρ(s) = t T At + (yi T Ay i )ri 2 j 1 ρ(s) α 1 = t T (A α 1 )t + (θ i α 1 )ri 2 tt (A α 1 )t t T t j 1 + (θ i α 1) ri 2 j 1 ρ(t) α 1 + (θ i α 1 ) sin 2 ϕ i Note that ρ(t) θ j = min{ρ(u); u S (m), u y i, i < j} Therefore, the second inequality in (7229) appears Let ϕ ij = (z i, y j ) for i = 1,, n and j = 1,, m Then ϕ ii = ϕ i and n i=j+1 Proof: Since y T j y i = 0 for i j and we have From (7231), which implies that y j = n z i cos ϕ ij (7231) cos ϕ ij sin ϕ i (7232) j 1 cos 2 ϕ ij = sin 2 ϕ j cos 2 ϕ ij (7233) (y i cos ϕ i z i ) T (y i cos ϕ i z i ) = sin 2 ϕ i, cos ϕ ij = y T j z i = y T j (y i cos ϕ i z i ) y j 2 y i cos ϕ i z i 2 sin ϕ i 1 = (y j, y j ) = n cos 2 ϕ ij j 1 sin 2 ϕ j = 1 cos 2 ϕ jj = cos 2 ϕ ij + n cos 2 ϕ ij (7234) i=j+1
19 73 Krylov subspace 279 Lemma 724 For each j = 1,, m, Proof: By (7231), sin ϕ j [(θ j α j )+ j 1 (α j+1 α i ) sin 2 ϕ i ]/(α j+1 α j ) (7235) ρ(y j, A α j I) = θ j α j = n (α i α j ) cos 2 ϕ ij It implies that = j 1 θ j α j + (α j α i ) cos 2 ϕ ij n (α i α j ) cos 2 ϕ ij i=j+1 (α j+1 α j ) n i=j+1 cos 2 ϕ ij j 1 = (α j+1 α j )(sin 2 ϕ j cos 2 ϕ ij ) (from (7235)) Solve sin 2 ϕ j and use (9310) to obtain inequality (7235) Explanation: By Lemmas 723 and 724, we have j = 1 : θ 1 ρ(s), s T z 1 = 0 (Lemma 723) j = 1 : sin 2 ϕ 1 θ 1 α 1 α 2 α 1 ρ(s) α 1 α 2 α 1, s T z 1 = 0 (Lemma 724) j = 2 : θ 2 ρ(s) + (α 1 α 1 ) sin 2 ϕ 1 ρ(s) + (α 1 α 1 ) ρ(ξ) α 1 α 2 α 1, s T z 1 = s T z 2 = 0, ξ T z 1 = 0 (Lemma 723) j = 2 : sin 2 ϕ 2 (Lemma724) (θ 2 α 2 ) + (α 3 α 1 ) sin 2 ϕ 1 α 3 α 2 j=1,j=2 [ρ(s) + (α 1 α 1 )( ρ(t) α 1 α 2 α 1 ) α 2 ] + α 3 α 1 α 3 α 2 ( ρ(t) α 1 α 2 α 1 ) 73 Krylov subspace Definition 731 Given a nonzero vector f, K m (f) = [f, Af,, A m 1 f] is called Krylov matrix and S m = K m (f) = f, Af,, A m 1 f is called Krylov subspace which are created by Lanczos if A is symmetric or Arnoldi if A is unsymmetric
20 280 Chapter 7 Lanczos Methods Lemma 731 Let {(θ i, y i )} m be Ritz pairs of K m (f) If ω is a polynomial with degree m 1 (ie, ω P m 1 ), then ω(a)f y k if and only if ω(θ k ) = 0, k = 1,, m Proof: Let where π(ξ) P m 2 Thus and exercise! Define Corollary 732 µ(ξ) ω(ξ) = (ξ θ k )π(ξ), π(a)f K m (f) y T k ω(a)f = y T k (A θ k )π(a)f = r T k π(a)f = 0 ( r k Q = K m (f)) m (ξ θ i ) and π k (ξ) µ(ξ) (ξ θ k ) y k = π k(a)f π k (A)f Proof: Since π k (θ i ) = 0 for θ i θ k, from Lemma 731, π k (A)f y i, i k Thus, π k (A)f // y k and then y k = π k(a)f π k (A)f Lemma 733 Let h be the normalized projection of f orthogonal to Z j, Z j span(z 1,, z j ) For each π P m 1 and each j m, [ ] sin (f, Z j 2 ) π(a)h ρ(π(a)f, A α j I) (α n α j ) (7336) cos (f, Z j ) π(α j ) Proof: Let ψ = (f, Z j ) = cos 1 f Z j and let g be the normalized projection of f onto Z j so that Since Z j is invariant under A, f = g cos ψ + h sin ψ s π(a)f = π(a)g cos ψ + π(a)h sin ψ, where π(a)g Z j and π(a)h (Z j ) A little calculation yields ρ(s, A α j I) = g (A α j I)π 2 (A)g cos 2 ψ + h (A α j I)π 2 (A)h sin 2 ψ π(a)f 2 (7337) The eigenvalues of A are labeled so that α 1 α 2 α n and
21 73 Krylov subspace 281 (a) v (A α j I)v 0 for all v Z j, in particular, v = π(a)g; (b) w (A α j I)w (α n α j ) w 2 for all w (Z j ), in particular, w = π(a)h Used (a) and (b) to simplify (7337), it becomes [ ] 2 π(a)h sin ψ ρ(s, A α j I) (α n α j ) π(a)f The proof is completed by using s 2 = π(a)f 2 = n π 2 (α i ) cos 2 (f, z i ) π 2 (α j ) cos 2 (f, z j ) 731 The Error Bound of Kaniel and Saad The error bounds come from choosing π P m 1 in Lemma 733 such that (i) π(α j ) is large, while π(a)h is small as possible, and (ii) ρ(s, A α j I) 0 where s = π(a)f To (i): Note that π(a)h 2 = n i=j+1 π2 (α i ) cos 2 (f, z j ) n i=j+1 cos2 (f, z j ) max i>j π2 (α i ) max π 2 (τ) τ [α j+1,α n] Chebychev polynomial solves min π P n j max τ [αj+1,α n ] π 2 (τ) To (ii): The following facts are known: (a) 0 θ j α j, (Cauchy interlace Theorem) (b) θ j α j ρ(s, A α j I), if s y i, for all i < j, (By minimax Theorem) (c) θ j α j ρ(s, A α j I)+ j 1 (α n α i ) sin 2 (y i, z i ), if s z i, for all i < j (Lemma 723) Theorem 734 (Saad) Let θ 1 θ m be the Ritz values from K m (f) (by Lanczos or Arnoldi) and let (α i, z i ) be the eigenpairs of A For j = 1,, m, and 0 θ j α j (α n α j ) where r = (α j α j+1 )/(α j+1 α n ) [ sin (f, Z j ) j 1 k=1 ( θ k α n θ k α j ) cos (f, Z j )T m j (1 + 2r) tan (z j, K m ) sin (f, Zj ) j 1 k=1 ( α k α n α k α j ) cos (f, Z j )T m j (1 + 2r), ] 2
22 282 Chapter 7 Lanczos Methods Proof: Apply Lemmas 733 and 731 To ensure (b), it requires s y i for i = 1,, j 1 By Lemma 731, we construct By Lemma 733 for this π(ξ) : π(ξ) = (ξ θ 1 ) (ξ θ j 1 ) π(ξ), π P m j π(a)h π(α j ) = (A θ 1) (A θ j 1 ) π(a)h (α j θ 1 ) (α j θ j 1 ) π(α j ) j 1 α n α k π(τ) α n θ k max τ [α j+1,α j ] π(α j ) k=1 j 1 α n α k π(τ) α j α k min max π Pm j j π(α j ) k=1 j 1 α n α k 1 α j α k T m j (1 + 2r) (7338) k=1 since h Z j On combining (b), Lemma 733 and (7338), the first of the results is obtained To prove the second inequality: π is chosen to satisfy π(α i ) = 0 for i = 1,, j 1 so that Therefore, s = π(a)f = z j π(α j ) cos (f, z j ) + π(a)h sin ψ tan (s, z j ) = sin (f, Zj ) π(a)h, cos (f, z j ) π(α j ) where π(ξ) = (ξ α 1 ) (ξ α j 1 ) π(ξ) with π(ξ) P m j The proof is completed by choosing π by Chebychev polynomial as above Theorem 735 Let θ m θ 1 be Royleigh-Ritz values of K m (f) and Az j = α j z j for j = n,, 1 with α n α 1, then and 0 α j θ j (α j α 1 ) [ sin (f, Z j ) 1 k= j+1 ( α n θ k cos (f, z j )T m j (1 + 2r) [ tan(z j, K m ) sin (f, 1 Z j ) cos (f, z j ) T m j (1 + 2r) where r = (α j 1 α j )/(α n α j 1 ) k= j+1 ( α k α n α k α j ) ]2 α k θ j ) ] 2,,
23 74 Applications to linear Systems and Least Squares 283 Theorem 736 (Kaniel) The Rayleigh-Ritz (θ j, y j ) from K m (f) to (α j, z j ) satisfy and 0 θ j α j (α n α j ) [ sin (f, Z j ) j 1 k=1 ( α k α n ] 2 α k α j ) cos (f, z j )T m j (1 + 2r) j 1 + (α n α k ) sin 2 (y k, z k ) k=1 sin 2 (y j, z j ) (θ j α j ) + j 1 k=1 (α j+1 α k ) sin 2 (y k, z k ) α j+1 α j, where r = (α j α j+1 )/(α j+1 α n ) 74 Applications to linear Systems and Least Squares 741 Symmetric Positive Definite System Recall: Let A be symmetric positive definite and Ax = b Then x minimizes the functional φ(x) = 1 2 xt Ax b T x (741) An approximate minimizer of φ can be regarded as an approximate solution to Ax = b One way to produce a sequence {x j } that converges to x is to generate a sequence of orthonormal vectors {q j } and to let x j minimize φ over span{q 1,, q j }, where j = 1,, n Let Q j = [q 1,, q j ] Since x span{q 1,, q j } φ(x) = 1 2 yt (Q T j AQ j )y y T (Q T j b) for some y R j, it follows that where x j = Q j y j, (742) (Q T j AQ j )y j = Q T j b (743) Note that Ax n = b We now consider how this approach to solving Ax = b can be made effective when A is large and sparse There are two hurdles to overcome: (i) the linear system (743) must be easily solved; (ii) we must be able to compute x j without having to refer to q 1,, q j explicitly as (742) suggests To (i): we use Lanczos algorithm algorithm 711 to generate the q i After j steps we obtain AQ j = Q j T j + r j e T j, (744)
24 284 Chapter 7 Lanczos Methods where α 1 β 1 0 T j = Q T β j AQ j = 1 α 2 βj 1 and T jy j = Q T j b (745) 0 β j 1 α j With this approach, (743) becomes a symmetric positive definite tridiagonal system which may be solved by LDL T Cholesky decomposition, ie, where L j = µ µ j 1 Compared the entries of (746), we get Note that we need only calculate T j = L j D j L T j, (746) 1 0 d 1 0 and D j = 0 0 d j d 1 = α 1, for i = 2,, j, µ i = β i 1 /d i 1, d i = α i β i 1 µ i (747) µ j = β j 1 /d j 1 d j = α j β j 1 µ j (748) in order to obtain L j and D j from L j 1 and D j 1 To (ii): Trick: we define C j = [c 1,, c j ] R n j and p j R j by the equations and observe that It follows from (749) that and therefore C j L T j = Q j, L j D j p j = Q T j b x j = Q j T 1 j Q T j b = Q j (L j D j L T j ) 1 Q T j b = C j p j [c 1, µ 2 c 1 + c 2,, µ j c j 1 + c j ] = [q 1,, q j ], C j = [C j 1, c j ], c j = q j µ j c j 1 If we set p j = [ρ 1,, ρ j ] T in L j D j p j = Q T j b, then that equation becomes ρ 1 q1 T b [ ρ 2 q2 T b L j 1 D j µ j d j 1 d j ] = ρ j 1 ρ j q T j 1b q T j b (749)
Since L_{j-1} D_{j-1} p_{j-1} = Q_{j-1}^T b, it follows that

    p_j = [p_{j-1}; ρ_j],   ρ_j = (q_j^T b - μ_j d_{j-1} ρ_{j-1}) / d_j,

and thus

    x_j = C_j p_j = C_{j-1} p_{j-1} + ρ_j c_j = x_{j-1} + ρ_j c_j.

This is precisely the kind of recursive formula for x_j that we need. Together with (7.4.8) and (7.4.9) it enables us to make the transition from (q_{j-1}, c_{j-1}, x_{j-1}) to (q_j, c_j, x_j) with a minimal amount of work and storage.

A further simplification results if we set q_1 = b/β_0, where β_0 = ||b||_2. For this choice of Lanczos starting vector we see that q_i^T b = 0 for i = 2, 3, ..., so that ρ_j = -μ_j d_{j-1} ρ_{j-1}/d_j for j ≥ 2, with ρ_1 = β_0/α_1. It follows from (7.4.4) that

    A x_j = A Q_j y_j = Q_j T_j y_j + r_j e_j^T y_j = Q_j Q_j^T b + r_j e_j^T y_j = b + r_j e_j^T y_j.

Thus, if β_j = ||r_j||_2 = 0 in the Lanczos iteration, then A x_j = b. Moreover, since ||A x_j - b||_2 = β_j |e_j^T y_j|, the iteration provides estimates of the current residual.

Algorithm 7.4.1 Given b ∈ R^n and a symmetric positive definite A ∈ R^{n×n}, the following algorithm computes x ∈ R^n such that Ax = b.

    β_0 = ||b||_2, q_1 = b/β_0, α_1 = q_1^T A q_1, d_1 = α_1, c_1 = q_1, ρ_1 = β_0/α_1, x_1 = b/α_1;
    For j = 1, ..., n-1,
        r_j = (A - α_j I) q_j - β_{j-1} q_{j-1}   (β_0 q_0 ≡ 0),
        β_j = ||r_j||_2,
        if β_j = 0 then
            set x = x_j and stop;
        else
            q_{j+1} = r_j / β_j,
            α_{j+1} = q_{j+1}^T A q_{j+1},
            μ_{j+1} = β_j / d_j,
            d_{j+1} = α_{j+1} - μ_{j+1} β_j,
            ρ_{j+1} = -μ_{j+1} d_j ρ_j / d_{j+1},
            c_{j+1} = q_{j+1} - μ_{j+1} c_j,
            x_{j+1} = x_j + ρ_{j+1} c_{j+1},
        end if
    end for
    x = x_n.

This algorithm requires one matrix-vector multiplication and 5n flops per iteration.
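A sketch of Algorithm 7.4.1 in NumPy (the routine name, the stopping tolerance on β_j, and the random SPD test matrix are additions of this demo; the minus sign in the ρ recursion follows from q_{j+1}^T b = 0 when q_1 = b/||b||_2):

```python
import numpy as np

def lanczos_spd_solve(A, b, tol=1e-12, maxit=None):
    """Solve SPD Ax = b via the Lanczos process with an LDL^T factorization
    of T_j, updating x_j = x_{j-1} + rho_j c_j (a sketch of Algorithm 7.4.1)."""
    n = len(b)
    maxit = n if maxit is None else maxit
    beta0 = np.linalg.norm(b)
    q = b / beta0                    # q_1
    Aq = A @ q
    alpha = q @ Aq                   # alpha_1
    d = alpha                        # d_1
    c = q.copy()                     # c_1
    rho = beta0 / alpha              # rho_1, so that x_1 = rho_1 c_1 = b / alpha_1
    x = rho * c
    q_prev, b_prev = np.zeros(n), 0.0
    for _ in range(maxit - 1):
        r = Aq - alpha * q - b_prev * q_prev
        beta = np.linalg.norm(r)
        if beta <= tol * beta0:      # practical stopping criterion (added here)
            break
        q_prev, q = q, r / beta
        Aq = A @ q
        alpha_next = q @ Aq
        mu = beta / d                # mu_{j+1} = beta_j / d_j
        d_next = alpha_next - mu * beta
        rho = -mu * d * rho / d_next # rho_{j+1} = -mu_{j+1} d_j rho_j / d_{j+1}
        c = q - mu * c               # c_{j+1} = q_{j+1} - mu_{j+1} c_j
        x = x + rho * c
        alpha, d, b_prev = alpha_next, d_next, beta
    return x

rng = np.random.default_rng(6)
n = 60
W = rng.standard_normal((n, n))
A = np.eye(n) + W @ W.T / n          # SPD with modest condition number
b = rng.standard_normal(n)
x = lanczos_spd_solve(A, b)
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```

Only a handful of n-vectors are kept, matching the "minimal work and storage" claim; the LDL^T quantities d_j, μ_j are updated by the scalar recursions (7.4.8).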
7.4.2 Symmetric Indefinite Systems

A key feature in the above development is the idea of computing the LDL^T (Cholesky-type) decomposition of the tridiagonal T_j. Unfortunately, this is potentially unstable if A, and consequently T_j, is not positive definite. Paige and Saunders (1975) developed a recursion for x_j based on an LQ decomposition of T_j instead. At the j-th step of the iteration we have Givens rotations J_1, ..., J_{j-1} such that

    T_j J_1 ··· J_{j-1} = L_j,

where L_j is lower triangular with bandwidth three: diagonal d_1, ..., d_j, subdiagonal e_2, ..., e_j and second subdiagonal f_3, ..., f_j. Note that with this factorization, x_j is given by

    x_j = Q_j y_j = Q_j T_j^{-1} Q_j^T b = W_j s_j,

where W_j ∈ R^{n×j} and s_j ∈ R^j are defined by

    W_j = Q_j J_1 ··· J_{j-1}  and  L_j s_j = Q_j^T b.

Scrutiny of these equations enables one to develop a formula for computing x_j from x_{j-1} and an easily computed multiple of w_j, the last column of W_j.

7.4.3 Connection of Algorithm 7.4.1 and the CG Method

Let x_j^L denote the iterates generated by Algorithm 7.4.1 and x_i^CG the iterates generated by the CG method with x_0^CG = 0. Since r_0^CG = b − A x_0 = b = p_1^CG, we have

    x_1^CG = α_1 p_1 = (b^T b / b^T A b) b = x_1^L.

Claim: x_i^CG = x_i^L for i = 1, 2, ....

(a) CG method (a variant version):

    x_0 = 0, r_0 = b.
    For k = 1, ..., n,
        if r_{k-1} = 0 then set x = x_{k-1} and quit
        else
            β_k = r_{k-1}^T r_{k-1} / r_{k-2}^T r_{k-2}  (β_1 ≡ 0),
            p_k = r_{k-1} + β_k p_{k-1}  (p_1 ≡ r_0),
            α_k = r_{k-1}^T r_{k-1} / p_k^T A p_k,
            x_k = x_{k-1} + α_k p_k,
            r_k = r_{k-1} − α_k A p_k.
        end if
    end for
    x = x_n.  (7.4.10)
Define R_k = [r_0, ..., r_{k-1}] ∈ R^{n×k} and let B_k be the unit upper bidiagonal matrix with superdiagonal entries −β_2, ..., −β_k. From p_j = r_{j-1} + β_j p_{j-1} (j = 2, ..., k) and p_1 = r_0, it follows that R_k = P_k B_k. Since the columns of P_k = [p_1, ..., p_k] are A-conjugate, we see that

    R_k^T A R_k = B_k^T diag(p_1^T A p_1, ..., p_k^T A p_k) B_k

is tridiagonal. Since span{p_1, ..., p_k} = span{r_0, ..., r_{k-1}} = span{b, Ab, ..., A^{k-1} b} and r_0, ..., r_{k-1} are mutually orthogonal, it follows that if Δ_k = diag(β_0, ..., β_{k-1}) with β_i = ||r_i||_2, then the columns of R_k Δ_k^{-1} form an orthonormal basis for span{b, Ab, ..., A^{k-1} b}. Consequently the columns of this matrix are essentially the Lanczos vectors of Algorithm 7.4.1, i.e.,

    q_i^L = ± r_{i-1}^CG / β_{i-1}  (i = 1, ..., k).

Moreover,

    T_k = Δ_k^{-1} B_k^T diag(p_i^T A p_i) B_k Δ_k^{-1}.

The diagonal and subdiagonal of this matrix involve quantities that are readily available during the conjugate gradient iteration. Thus, we can obtain good estimates of A's extremal eigenvalues (and condition number) as we generate the x_k in (7.4.10).

Similarly, p_i^CG = c_i^L × constant. To show that the c_i^L are A-orthogonal: since C_j L_j^T = Q_j, i.e. C_j = Q_j L_j^{-T}, it follows that

    C_j^T A C_j = L_j^{-1} Q_j^T A Q_j L_j^{-T} = L_j^{-1} L_j D_j L_j^T L_j^{-T} = D_j.

So {c_i}_{i=1}^j are A-orthogonal.

(b) It is well known that x_j^CG minimizes the functional φ(x) = (1/2) x^T A x − b^T x over the subspace span{r_0, A r_0, ..., A^{j-1} r_0}, and x_j^L minimizes φ(x) over the subspace span{q_1, ..., q_j}. We also know that K[q_1, A, j] = Q_j R_j, which implies K(q_1, A, j) = span{q_1, ..., q_j}. But q_1 = b/||b||_2 and r_0 = b, so

    span{r_0, A r_0, ..., A^{j-1} r_0} = K(q_1, A, j) = span{q_1, ..., q_j};

therefore x_j^CG = x_j^L.
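The CG variant (7.4.10) above is easy to transcribe; the following NumPy sketch (names ours) follows the text's indexing, with β_1 = 0 and p_1 = r_0:

```python
import numpy as np

def cg_variant(A, b, tol=1e-12):
    # CG in the form (7.4.10): beta_1 = 0 and p_1 = r_0.
    n = len(b)
    x = np.zeros(n)
    r = b.copy()
    p = np.zeros(n)
    rs_old = None
    for _ in range(n):
        rs = r @ r
        if np.sqrt(rs) < tol:
            break
        p = r if rs_old is None else r + (rs / rs_old) * p
        a = rs / (p @ A @ p)
        x = x + a * p
        r = r - a * (A @ p)
        rs_old = rs
    return x

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x_cg = cg_variant(A, b)
# the first iterate equals (b^T b / b^T A b) b, matching x_1^L above
assert np.allclose(A @ x_cg, b)
```

By the equivalence argued in this subsection, these iterates coincide (in exact arithmetic) with those of Algorithm 7.4.1.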
7.4.4 Bidiagonalization and the SVD

Suppose U^T A V = B is the bidiagonalization of A ∈ R^{m×n}, where

    U = [u_1, ..., u_m],  U^T U = I_m,  V = [v_1, ..., v_n],  V^T V = I_n,  (7.4.11)

and B ∈ R^{m×n} is upper bidiagonal with diagonal α_1, ..., α_n and superdiagonal β_1, ..., β_{n-1}.  (7.4.12)

Recall that this decomposition serves as a front end for the SVD algorithm. Unfortunately, if A is large and sparse, then we can expect large, dense submatrices to arise during the Householder bidiagonalization. It would be nice to develop a method for computing B directly, without any orthogonal update of the matrix A.

We compare columns in the equations AV = UB and A^T U = V B^T:

    A v_j = α_j u_j + β_{j-1} u_{j-1}  (β_0 u_0 ≡ 0),
    A^T u_j = α_j v_j + β_j v_{j+1}  (β_n v_{n+1} ≡ 0),

for j = 1, ..., n. Define

    r_j = A v_j − β_{j-1} u_{j-1}  and  p_j = A^T u_j − α_j v_j.

We may conclude that

    α_j = ±||r_j||_2,  u_j = r_j/α_j,  β_j = ±||p_j||_2,  v_{j+1} = p_j/β_j.

These equations define the Lanczos method for bidiagonalizing a rectangular matrix (Paige (1974)):

    Given v_1 ∈ R^n with unit 2-norm,
    r_1 = A v_1, α_1 = ||r_1||_2.
    For j = 1, ..., n,
        If α_j = 0 then stop;
        else
            u_j = r_j/α_j,
            p_j = A^T u_j − α_j v_j,
            β_j = ||p_j||_2.
            If β_j = 0 then stop;
            else
                v_{j+1} = p_j/β_j,
                r_{j+1} = A v_{j+1} − β_j u_j,
                α_{j+1} = ||r_{j+1}||_2.
            end if
        end if
    end for  (7.4.13)
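A direct transcription of (7.4.13) in NumPy (our sketch; generic input, no breakdown handling beyond early exit) confirms the relation A v_j = α_j u_j + β_{j-1} u_{j-1}, i.e. U_k^T A V_k = B_k:

```python
import numpy as np

def lanczos_bidiag(A, v1, k):
    # Lanczos bidiagonalization (7.4.13): builds U_k, V_{k+1} and the
    # bidiagonal entries alpha_j, beta_j column by column.
    m, n = A.shape
    U = np.zeros((m, k))
    V = np.zeros((n, k + 1))
    alphas = np.zeros(k)
    betas = np.zeros(k)
    V[:, 0] = v1 / np.linalg.norm(v1)
    r = A @ V[:, 0]
    for j in range(k):
        alphas[j] = np.linalg.norm(r)
        if alphas[j] == 0:
            break
        U[:, j] = r / alphas[j]
        p = A.T @ U[:, j] - alphas[j] * V[:, j]
        betas[j] = np.linalg.norm(p)
        if betas[j] == 0:
            break
        V[:, j + 1] = p / betas[j]
        r = A @ V[:, j + 1] - betas[j] * U[:, j]
    return U, V, alphas, betas

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))
U, V, a, bt = lanczos_bidiag(A, rng.standard_normal(4), 4)
B = np.diag(a) + np.diag(bt[:-1], 1)   # upper bidiagonal
assert np.allclose(U.T @ A @ V[:, :4], B, atol=1e-8)
assert np.allclose(U.T @ U, np.eye(4), atol=1e-8)
```

Only matrix-vector products with A and A^T are needed, which is the point for large sparse A.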
It is essentially equivalent to applying the Lanczos tridiagonalization scheme to the symmetric matrix

    C = [ 0    A ]
        [ A^T  0 ].

We know that

    λ_i(C) = σ_i(A) = −λ_{n+m−i+1}(C)  for i = 1, ..., n.

Because of this, the large singular values of the bidiagonal matrix B_j (upper bidiagonal with diagonal α_1, ..., α_j and superdiagonal β_1, ..., β_{j-1}) tend to be very good approximations to the large singular values of A.

7.4.5 Least Squares Problems

As detailed in Chapter III, the full-rank LS problem min ||Ax − b||_2 can be solved by the bidiagonalization (7.4.11)-(7.4.12). In particular,

    x_LS = V y_LS = Σ_{i=1}^n a_i v_i,

where y = (a_1, ..., a_n)^T solves the bidiagonal system By = (u_1^T b, ..., u_n^T b)^T.

Disadvantage: Note that because B is upper bidiagonal, we cannot solve for y until the bidiagonalization is complete. We are required to save the vectors v_1, ..., v_n, an unhappy circumstance if n is very large.

Modification: It can be accomplished more favorably if A is reduced to lower bidiagonal form:

    U^T A V = B,  m ≥ n + 1,

where B is lower bidiagonal with diagonal α_1, ..., α_n and subdiagonal β_1, ..., β_n, V = [v_1, ..., v_n] and U = [u_1, ..., u_m]. It is straightforward to develop a Lanczos procedure which is very similar to (7.4.13). Let V_j = [v_1, ..., v_j], U_j = [u_1, ..., u_j], and let B_j ∈ R^{(j+1)×j} be lower bidiagonal with diagonal α_1, ..., α_j and subdiagonal β_1, ..., β_j, and consider minimizing ||Ax − b||_2 over all vectors of the form x = V_j y, y ∈ R^j. Since

    ||A V_j y − b||_2^2 = ||B_j y − U_{j+1}^T b||_2^2 + Σ_{i=j+2}^m (u_i^T b)^2,
it follows that x_j = V_j y_j is the minimizer of the LS problem over span{V_j}, where y_j minimizes the (j + 1) × j LS problem

    min_y ||B_j y − U_{j+1}^T b||_2.

Since B_j is lower bidiagonal, it is easy to compute Givens rotations J_1, ..., J_j such that

    J_j ··· J_1 B_j = [ R_j ]
                      [ 0   ]

with R_j upper bidiagonal. Let

    J_j ··· J_1 U_{j+1}^T b = [ d_j ]
                              [ u   ].

Then

    ||B_j y − U_{j+1}^T b||_2 = || J_j ··· J_1 B_j y − J_j ··· J_1 U_{j+1}^T b ||_2 = || [ R_j y − d_j ; −u ] ||_2.

So y_j = R_j^{-1} d_j and

    x_j = V_j y_j = V_j R_j^{-1} d_j = W_j d_j,  where  W_j = V_j R_j^{-1}.

Writing W_j = (W_{j-1}, w_j), we have

    w_j = (v_j − w_{j-1} r_{j-1,j}) / r_{jj},

where r_{j-1,j} and r_{jj} are elements of R_j, and R_j can be computed from R_{j-1}. Similarly, d_j = [d_{j-1}^T, δ_j]^T, and x_j can be obtained from x_{j-1}:

    x_j = W_j d_j = (W_{j-1}, w_j) [ d_{j-1} ; δ_j ] = W_{j-1} d_{j-1} + δ_j w_j,

thus

    x_j = x_{j-1} + δ_j w_j.

For details see Paige-Saunders (1978).

7.4.6 Error Estimation of Least Squares Problems

Continuity of A^+: consider the map R^{m×n} → R^{n×m} defined by A ↦ A^+.

Lemma 7.4.2 If {A_i} converges to A and rank(A_i) = rank(A) = n, then {A_i^+} also converges to A^+.

Proof: Since lim_{i→∞} A_i^T A_i = A^T A is nonsingular,

    A_i^+ = (A_i^T A_i)^{-1} A_i^T → (A^T A)^{-1} A^T = A^+.

Example 7.4.1 Let

    A_ε = [ ε  0 ]
          [ 0  ε ]
          [ 0  0 ]   with ε > 0,   and   A_0 = 0 ∈ R^{3×2},  so that rank(A_0) < 2.

Then A_ε → A_0 as ε → 0, but

    A_ε^+ = [ 1/ε   0   0 ]
            [ 0    1/ε  0 ]

diverges as ε → 0, whereas A_0^+ = 0 ∈ R^{2×3}.
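Example 7.4.1 (whose matrices we have reconstructed from context, so treat the exact entries as our reading of the example) can be checked numerically with numpy.linalg.pinv:

```python
import numpy as np

# A_eps -> A_0 entrywise, yet pinv(A_eps) diverges, because the rank
# drops in the limit: rank(A_eps) = 2 for eps > 0, but rank(A_0) = 0.
A0 = np.zeros((3, 2))
for eps in [1e-2, 1e-4, 1e-6]:
    Aeps = np.array([[eps, 0.0], [0.0, eps], [0.0, 0.0]])
    P = np.linalg.pinv(Aeps)
    assert np.allclose(P, np.array([[1/eps, 0, 0], [0, 1/eps, 0]]))
assert np.allclose(np.linalg.pinv(A0), np.zeros((2, 3)))
```

This is the discontinuity that Lemma 7.4.2 rules out under the rank assumption.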
Theorem 7.4.3 Let A, B ∈ R^{m×n}. Then

    ||B^+ − A^+||_F ≤ √2 ||A − B||_F max{ ||A^+||_2^2, ||B^+||_2^2 }.

(Without proof.)

Remark 7.4.1 It does not follow that A → B implies A^+ → B^+, because A^+ can diverge to ∞; see Example 7.4.1.

Theorem 7.4.4 If rank(A) = rank(B), then

    ||A^+ − B^+||_F ≤ μ ||A^+||_2 ||B^+||_2 ||A − B||_F,

where μ = √2 if rank(A) < min(m, n) and μ = 1 if rank(A) = min(m, n).

Pseudo-inverse of A: A^+ is the unique solution of the equations

    A^+ A A^+ = A^+,  (A A^+)^* = A A^+,  A A^+ A = A,  (A^+ A)^* = A^+ A.

P_A = A A^+ is Hermitian and idempotent with R(P_A) = R(A): P_A is the orthogonal projection onto R(A). Similarly, R_A = A^+ A is the orthogonal projection onto R(A^*). Furthermore,

    ρ_LS^2 = ||b − A A^+ b||_2^2 = ||(I − A A^+) b||_2^2.

Lemma 7.4.1 (Banach Lemma)

    ||B^{-1} − A^{-1}|| ≤ ||A − B|| ||A^{-1}|| ||B^{-1}||.

Proof: Writing B = A + δA, from

    ((A + δA)^{-1} − A^{-1})(A + δA) = I − A^{-1}(A + δA) = −A^{-1} δA

the lemma follows immediately.

Theorem 7.4.5 (i) The product P_B P_A^⊥ can be written in the form

    P_B P_A^⊥ = (B^+)^* R_B E^* P_A^⊥,

where P_A^⊥ = I − P_A and E = B − A. Thus ||P_B P_A^⊥|| ≤ ||B^+||_2 ||E||.

(ii) If rank(A) = rank(B), then ||P_B − P_A|| ≤ min{ ||B^+||_2, ||A^+||_2 } ||E||.

Proof:

    P_B P_A^⊥ = (P_B)^* P_A^⊥ = (B^+)^* B^* P_A^⊥ = (B^+)^* (A + E)^* P_A^⊥
              = (B^+)^* E^* P_A^⊥  (since A^* P_A^⊥ = 0)
              = (B^+)^* R_B E^* P_A^⊥  (using (B^+)^* = (B^+)^* R_B),

and the bound follows since ||R_B|| ≤ 1 and ||P_A^⊥|| ≤ 1. Part (ii) follows from the relation between ||P_B − P_A||, ||P_B P_A^⊥|| and ||P_B^⊥ P_A|| when rank(A) = rank(B). Exercise! (Use the C-S decomposition.)
Theorem 7.4.6 It holds that

    B^+ − A^+ = −B^+ P_B E R_A A^+ + B^+ P_B P_A^⊥ − R_B^⊥ R_A A^+ =: F_1 + F_2 + F_3.

Proof: Substituting P_B = B B^+, E = B − A, R_A = A^+ A, P_A^⊥ = I − A A^+ and R_B^⊥ = I − B^+ B, the right-hand side becomes

    −B^+ B B^+ (B − A) A^+ A A^+ + B^+ B B^+ (I − A A^+) − (I − B^+ B) A^+ A A^+
    = −B^+ (B − A) A^+ + B^+ (I − A A^+) − (I − B^+ B) A^+
    = B^+ − A^+.

Theorem 7.4.7 If B = A + E, then

    ||B^+ − A^+||_F ≤ √2 ||E||_F max{ ||A^+||_2^2, ||B^+||_2^2 }.

Proof: Suppose rank(B) ≤ rank(A). The column spaces of F_1 and F_2 are orthogonal to the column space of F_3 (since (I − B^+ B) B^+ = 0). Hence

    ||B^+ − A^+||_F^2 = ||F_1 + F_2||_F^2 + ||F_3||_F^2.

Since F_1 + F_2 = B^+ (−P_B E A^+ P_A + P_B P_A^⊥), we have

    ||F_1 + F_2||_F^2 ≤ ||B^+||_2^2 ( ||P_B E A^+ P_A||_F^2 + ||P_B P_A^⊥||_F^2 ).

By Theorems 7.4.5 and 7.4.6 it follows that

    ||P_B E A^+ P_A||_F^2 + ||P_B P_A^⊥||_F^2 ≤ ||P_B E A^+||_F^2 + ||P_B^⊥ P_A||_F^2
        = ||P_B E A^+||_F^2 + ||P_B^⊥ E A^+||_F^2 = ||E A^+||_F^2 ≤ ||E||_F^2 ||A^+||_2^2,

where P_B^⊥ P_A = −P_B^⊥ E A^+ because P_B^⊥ B = 0 gives P_B^⊥ A = −P_B^⊥ E. Thus

    ||F_1 + F_2||_F ≤ ||A^+||_2 ||B^+||_2 ||E||_F.

By Theorem 7.4.6 we have

    ||F_3||_F ≤ ||A^+||_2 ||R_B^⊥ R_A||_F = ||A^+||_2 ||R_A R_B^⊥||_F = ||A^+||_2 ||A^+ E R_B^⊥||_F ≤ ||A^+||_2^2 ||E||_F.

The final bound is symmetric in A and B, so it also holds when rank(B) ≥ rank(A).

Theorem 7.4.8 If rank(A) = rank(B), then

    ||B^+ − A^+||_F ≤ √2 ||A^+||_2 ||B^+||_2 ||E||_F

(see Wedin (1973)). From the above we have

    ||B^+ − A^+||_F / ||B^+||_2 ≤ √2 κ_2(A) ||E||_F / ||A||_2.

This bound implies that as E approaches zero, the relative error in B^+ approaches zero, which further implies that B^+ approaches A^+.

Corollary 7.4.9 lim_{B→A} B^+ = A^+ if and only if rank(A) = rank(B) as B approaches A. (See Stewart 1977.)
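Theorem 7.4.8 can be spot-checked numerically; the sketch below (ours) draws a random full-column-rank A and a small perturbation E, so that rank(A + E) = rank(A) and even the sharper constant μ = 1 applies, which the √2 bound certainly covers:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 3))           # full column rank (generic)
E = 1e-3 * rng.standard_normal((5, 3))    # small, so rank(A + E) = rank(A)
B = A + E
lhs = np.linalg.norm(np.linalg.pinv(B) - np.linalg.pinv(A), 'fro')
rhs = (np.sqrt(2) * np.linalg.norm(np.linalg.pinv(A), 2)
       * np.linalg.norm(np.linalg.pinv(B), 2) * np.linalg.norm(E, 'fro'))
assert lhs <= rhs
```

Of course a numerical check on one random instance is no proof; it merely illustrates the scale of the bound.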
7.4.7 Perturbation of Solutions of the Least Squares Problem

We first state two corollaries of the SVD theorem.

Theorem 7.4.10 (SVD) If A ∈ R^{m×n}, then there exist orthogonal matrices U = [u_1, ..., u_m] ∈ R^{m×m} and V = [v_1, ..., v_n] ∈ R^{n×n} such that

    U^T A V = diag(σ_1, ..., σ_p),  p = min(m, n),  σ_1 ≥ σ_2 ≥ ··· ≥ σ_p ≥ 0.

Corollary 7.4.11 If the SVD is given by Theorem 7.4.10 and σ_1 ≥ ··· ≥ σ_r > σ_{r+1} = ··· = σ_p = 0, then

(a) rank(A) = r.
(b) N(A) = span{v_{r+1}, ..., v_n}.
(c) Range(A) = span{u_1, ..., u_r}.
(d) A = Σ_{i=1}^r σ_i u_i v_i^T = U_r Σ_r V_r^T, where U_r = [u_1, ..., u_r], V_r = [v_1, ..., v_r] and Σ_r = diag(σ_1, ..., σ_r).
(e) ||A||_F^2 = σ_1^2 + ··· + σ_r^2.
(f) ||A||_2 = σ_1.

Proof: exercise!

Corollary 7.4.12 Let the SVD of A ∈ R^{m×n} be given by Theorem 7.4.10. If k < r = rank(A) and A_k = Σ_{i=1}^k σ_i u_i v_i^T, then

    min_{rank(X)=k, X ∈ R^{m×n}} ||A − X||_2 = ||A − A_k||_2 = σ_{k+1}.  (7.4.14)

Proof: Let X ∈ R^{m×n} with rank(X) = k, and let τ_1 ≥ ··· ≥ τ_n ≥ 0 be the singular values of X. Since A = X + (A − X) and τ_{k+1} = 0, we get

    σ_{k+1} = |σ_{k+1} − τ_{k+1}| ≤ ||A − X||_2.

For the matrix A_k = U Σ̃ V^T with Σ̃ = diag(σ_1, ..., σ_k, 0, ..., 0), we have

    ||A − A_k||_2 = ||U (Σ − Σ̃) V^T||_2 = ||Σ − Σ̃||_2 = σ_{k+1}.

LS problem: ||Ax − b||_2 = min!  x_LS = A^+ b.
Perturbed LS problem: ||(A + E) y − (b + f)||_2 = min!  y = (A + E)^+ (b + f).

Lemma 7.4.13 Let A, E ∈ R^{m×n} and rank(A) = r.

(a) If rank(A + E) > r, then ||(A + E)^+||_2 ≥ 1 / ||E||_2.

(b) If rank(A + E) ≤ r and ||A^+||_2 ||E||_2 < 1, then rank(A + E) = r and

    ||(A + E)^+||_2 ≤ ||A^+||_2 / (1 − ||A^+||_2 ||E||_2).
Proof: Let τ_1 ≥ ··· ≥ τ_n be the singular values of A + E.

To (a): If τ_k is the smallest nonzero singular value of A + E, then k ≥ r + 1 because rank(A + E) > r. By Corollary 7.4.6 we have

    ||E||_2 = ||(A + E) − A||_2 ≥ τ_{r+1} ≥ τ_k,

and therefore ||(A + E)^+||_2 = 1/τ_k ≥ 1/||E||_2.

To (b): Let σ_1 ≥ ··· ≥ σ_n be the singular values of A. Then σ_r ≠ 0 because rank(A) = r, and ||A^+||_2 = 1/σ_r. Since ||A^+||_2 ||E||_2 < 1, we have ||E||_2 < σ_r, and then by Corollary 7.4.6 it must hold that rank(A + E) ≥ r, so rank(A + E) = r. By Weyl's theorem (Theorem 6.1.5) we have τ_r ≥ σ_r − ||E||_2, and here σ_r − ||E||_2 > 0, so one obtains

    ||(A + E)^+||_2 = 1/τ_r ≤ 1/(σ_r − ||E||_2) = ||A^+||_2 / (1 − ||A^+||_2 ||E||_2).

Lemma 7.4.14 Let A, E ∈ R^{m×n}, b, f ∈ R^m, x = A^+ b, y = (A + E)^+ (b + f) and r = b − Ax. Then

    y − x = [ −(A + E)^+ E A^+ + (A + E)^+ (I − A A^+) − (I − (A + E)^+ (A + E)) A^+ ] b + (A + E)^+ f
          = −(A + E)^+ E x + (A + E)^+ ((A + E)^+)^T E^T r + (I − (A + E)^+ (A + E)) E^T (A^+)^T x + (A + E)^+ f.

Proof: y − x = [(A + E)^+ − A^+] b + (A + E)^+ f, and for (A + E)^+ − A^+ one has the decomposition

    (A + E)^+ − A^+ = −(A + E)^+ E A^+ + (A + E)^+ (I − A A^+) − (I − (A + E)^+ (A + E)) A^+,

which is verified by multiplying out as in the proof of Theorem 7.4.6. Let C := A + E. Applying the Penrose equations to C, we obtain

    C^+ = C^+ C C^+ = C^+ (C^+)^T C^T

and

    A^T (I − A A^+) = A^T − A^T A A^+ = A^T − A^T (A^+)^T A^T = A^T − (A A^+ A)^T = 0,

also A^+ = A^T (A^+)^T A^+ and (I − C^+ C) C^T = 0. Hence

    C^+ (I − A A^+) = C^+ (C^+)^T C^T (I − A A^+) = C^+ (C^+)^T (A + E)^T (I − A A^+) = C^+ (C^+)^T E^T (I − A A^+)

and

    (I − C^+ C) A^+ = (I − C^+ C) A^T (A^+)^T A^+ = (I − C^+ C)(C − E)^T (A^+)^T A^+ = −(I − C^+ C) E^T (A^+)^T A^+.

If we substitute this into the second and third terms of the decomposition, then with r = (I − A A^+) b and x = A^+ b we have the result:

    y − x = [ −C^+ E A^+ + C^+ (C^+)^T E^T (I − A A^+) + (I − C^+ C) E^T (A^+)^T A^+ ] b + C^+ f
          = −C^+ E x + C^+ (C^+)^T E^T r + (I − C^+ C) E^T (A^+)^T x + C^+ f.
Theorem 7.4.15 Let A, E ∈ R^{m×n}, b, f ∈ R^m, x = A^+ b ≠ 0, y = (A + E)^+ (b + f) and r = b − Ax. If rank(A + E) ≤ rank(A) and ||A^+||_2 ||E||_2 < 1, then

    ||y − x||_2 / ||x||_2 ≤ ( ||A||_2 ||A^+||_2 / (1 − ||A^+||_2 ||E||_2) )
        × [ 2 ||E||_2/||A||_2 + ( ||A^+||_2 / (1 − ||A^+||_2 ||E||_2) ) ||E||_2 ||r||_2 / (||A||_2 ||x||_2) + ||f||_2 / (||A||_2 ||x||_2) ].

Proof: From Lemma 7.4.14 it follows that

    ||y − x||_2 ≤ ||(A + E)^+||_2 [ ||E||_2 ||x||_2 + ||(A + E)^+||_2 ||E||_2 ||r||_2 + ||f||_2 ]
        + ||I − (A + E)^+ (A + E)||_2 ||E||_2 ||A^+||_2 ||x||_2.

Since I − (A + E)^+ (A + E) is symmetric and idempotent, ||I − (A + E)^+ (A + E)||_2 = 1 if (A + E)^+ (A + E) ≠ I. Together with the estimate of Lemma 7.4.13(b), ||(A + E)^+||_2 ≤ ||A^+||_2 / (1 − ||A^+||_2 ||E||_2), we obtain

    ||y − x||_2 ≤ ( ||A^+||_2 / (1 − ||A^+||_2 ||E||_2) ) [ 2 ||E||_2 ||x||_2 + ( ||A^+||_2 / (1 − ||A^+||_2 ||E||_2) ) ||E||_2 ||r||_2 + ||f||_2 ],

and dividing by ||x||_2 (and factoring out ||A||_2) gives the stated bound.

7.5 Unsymmetric Lanczos Method

Suppose A ∈ R^{n×n} and that a nonsingular matrix X exists such that

    X^{-1} A X = T,

where T is tridiagonal with diagonal α_1, ..., α_n, superdiagonal γ_1, ..., γ_{n-1} and subdiagonal β_1, ..., β_{n-1}. Let X = [x_1, ..., x_n] and X^{-T} = Y = [y_1, ..., y_n]. Comparing columns in AX = XT and A^T Y = Y T^T, we find that

    A x_j = γ_{j-1} x_{j-1} + α_j x_j + β_j x_{j+1}  (γ_0 x_0 ≡ 0)

and

    A^T y_j = β_{j-1} y_{j-1} + α_j y_j + γ_j y_{j+1}  (β_0 y_0 ≡ 0)

for j = 1, ..., n − 1. These equations together with Y^T X = I_n imply α_j = y_j^T A x_j and

    β_j x_{j+1} = r_j := (A − α_j I) x_j − γ_{j-1} x_{j-1},
    γ_j y_{j+1} = p_j := (A − α_j I)^T y_j − β_{j-1} y_{j-1}.  (7.5.1)

There is some flexibility in choosing the scale factors β_j and γ_j. A canonical choice is to set β_j = ||r_j||_2 and γ_j = x_{j+1}^T p_j, giving:
Algorithm 7.5.1 (Biorthogonalization method of Lanczos) Given x_1, y_1 ∈ R^n with x_1^T x_1 = y_1^T x_1 = 1.

    For j = 1, ..., n − 1,
        α_j = y_j^T A x_j,
        r_j = (A − α_j I) x_j − γ_{j-1} x_{j-1}  (γ_0 x_0 ≡ 0),
        β_j = ||r_j||_2.
        If β_j > 0 then
            x_{j+1} = r_j / β_j,
            p_j = (A − α_j I)^T y_j − β_{j-1} y_{j-1}  (β_0 y_0 ≡ 0),
            γ_j = x_{j+1}^T p_j,
        else stop;
        If γ_j ≠ 0 then y_{j+1} = p_j / γ_j else stop;
    end for
    α_n = y_n^T A x_n.  (7.5.2)

Define X_j = [x_1, ..., x_j], Y_j = [y_1, ..., y_j] and T_j to be the leading j × j principal submatrix of T. It is easy to verify that

    A X_j = X_j T_j + r_j e_j^T,  A^T Y_j = Y_j T_j^T + p_j e_j^T.  (7.5.3)

Remark 7.5.1 (i) p_j^T r_j = β_j γ_j y_{j+1}^T x_{j+1} = β_j γ_j, from (7.5.1).

(ii) Breakdown of algorithm (7.5.2) occurs if p_j^T r_j = 0:
(a) r_j = 0 (so β_j = 0): then R(X_j) is an invariant subspace of A (by (7.5.3)).
(b) p_j = 0 (so γ_j = 0): then R(Y_j) is an invariant subspace of A^T (by (7.5.3)).
(c) p_j^T r_j = 0 but p_j ≠ 0 and r_j ≠ 0: then (7.5.2) breaks down, and we must begin the algorithm again with new starting vectors.

(iii) If p_j^T r_j is very small, then γ_j or β_j is small. Hence y_{j+1} or x_{j+1} is large, so algorithm (7.5.2) is unstable.

Definition 7.5.1 An upper Hessenberg matrix H = (h_ij) is called unreducible if h_{i+1,i} ≠ 0 for i = 1, ..., n − 1 (that is, all subdiagonal entries are nonzero). A tridiagonal matrix T = (t_ij) is called unreducible if t_{i,i-1} ≠ 0 for i = 2, ..., n and t_{i,i+1} ≠ 0 for i = 1, ..., n − 1.

Theorem 7.5.2 Let A ∈ R^{n×n}. Then

(i) If x_1 ≠ 0 is such that K[x_1, A, n] = [x_1, A x_1, ..., A^{n-1} x_1] is nonsingular, and if X is a nonsingular matrix such that K[x_1, A, n] = XR, where R is an upper triangular matrix, then H = X^{-1} A X is an unreducible upper Hessenberg matrix.

(ii) Let X be a nonsingular matrix with first column x_1. If H = X^{-1} A X is an upper Hessenberg matrix, then

    K[x_1, A, n] = X K[e_1, H, n] ≡ XR,

where R is an upper triangular matrix. Furthermore, if H is unreducible, then R is nonsingular.
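A NumPy sketch of Algorithm 7.5.1 (ours; it normalizes y_1 so that y_1^T x_1 = 1, assumes no breakdown, and uses a mildly nonsymmetric, well-conditioned test matrix to keep the recurrence stable) verifies the biorthogonality Y_j^T X_j = I and the tridiagonal form Y_j^T A X_j = T_j of (7.5.3):

```python
import numpy as np

def unsym_lanczos(A, x1, y1, k):
    # Sketch of the two-sided Lanczos process (7.5.2), no breakdown handling.
    # Scale factors: beta_j = ||r_j||_2, gamma_j = x_{j+1}^T p_j.
    n = A.shape[0]
    X = np.zeros((n, k)); Y = np.zeros((n, k))
    alph = np.zeros(k); bet = np.zeros(k); gam = np.zeros(k)
    X[:, 0] = x1 / np.linalg.norm(x1)
    Y[:, 0] = y1 / (y1 @ X[:, 0])      # enforce y_1^T x_1 = 1
    x_old = np.zeros(n); y_old = np.zeros(n)
    beta_old = gamma_old = 0.0
    for j in range(k):
        alph[j] = Y[:, j] @ A @ X[:, j]
        if j == k - 1:
            break
        r = A @ X[:, j] - alph[j] * X[:, j] - gamma_old * x_old
        bet[j] = np.linalg.norm(r)
        x_new = r / bet[j]
        p = A.T @ Y[:, j] - alph[j] * Y[:, j] - beta_old * y_old
        gam[j] = x_new @ p
        x_old, y_old = X[:, j], Y[:, j]
        X[:, j + 1] = x_new
        Y[:, j + 1] = p / gam[j]
        beta_old, gamma_old = bet[j], gam[j]
    return X, Y, alph, bet, gam

rng = np.random.default_rng(2)
# well-separated diagonal keeps gamma_j away from zero in this demo
A = np.diag(np.arange(1.0, 7.0)) + 0.1 * rng.standard_normal((6, 6))
X, Y, al, be, ga = unsym_lanczos(A, rng.standard_normal(6),
                                 rng.standard_normal(6), 4)
T4 = np.diag(al) + np.diag(be[:3], -1) + np.diag(ga[:3], 1)
assert np.allclose(Y.T @ X, np.eye(4), atol=1e-6)
assert np.allclose(Y.T @ A @ X, T4, atol=1e-6)
```

As Remark 7.5.1(iii) warns, a small p_j^T r_j would make γ_j tiny and y_{j+1} huge; a production code needs look-ahead or restarting, which this sketch omits.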
(iii) If H = X^{-1} A X and H̃ = Y^{-1} A Y, where H and H̃ are both upper Hessenberg matrices, H is unreducible, and the first columns x_1 and y_1 of X and Y, respectively, are linearly dependent, then J = X^{-1} Y is an upper triangular matrix and H̃ = J^{-1} H J.

Proof: ad (i): Since x_1, A x_1, ..., A^{n-1} x_1 are linearly independent, A^n x_1 is a linear combination of {x_1, A x_1, ..., A^{n-1} x_1}, i.e., there exist c_0, ..., c_{n-1} such that

    A^n x_1 = Σ_{i=0}^{n-1} c_i A^i x_1.

Let C be the companion matrix

    C = [e_2, e_3, ..., e_n, c],  c = (c_0, c_1, ..., c_{n-1})^T,

i.e., C has ones on the subdiagonal, the vector c as its last column, and zeros elsewhere. Then we have

    K[x_1, A, n] C = [A x_1, A^2 x_1, ..., A^{n-1} x_1, A^n x_1] = A K[x_1, A, n],

that is, XRC = AXR. We then have X^{-1} A X = R C R^{-1} = H, an unreducible upper Hessenberg matrix.

ad (ii): From A = X H X^{-1} it follows that A^i x_1 = X H^i X^{-1} x_1 = X H^i e_1. Then

    K[x_1, A, n] = [x_1, A x_1, ..., A^{n-1} x_1] = [X e_1, X H e_1, ..., X H^{n-1} e_1] = X [e_1, H e_1, ..., H^{n-1} e_1].

If H is upper Hessenberg, then R = [e_1, H e_1, ..., H^{n-1} e_1] is upper triangular. If H is an unreducible upper Hessenberg matrix, then R is nonsingular, since r_11 = 1, r_22 = h_21, r_33 = h_21 h_32, and so on.

ad (iii): Let y_1 = λ x_1. Applying (ii) to H, it follows that K[x_1, A, n] = X R_1. Applying (ii) to H̃, we also have K[y_1, A, n] = Y R_2. Here R_1 and R_2 are upper triangular. Since y_1 = λ x_1, we get λ K[x_1, A, n] = λ X R_1 = Y R_2. Since H is unreducible, R_1 is nonsingular by (ii); hence R_2 is nonsingular and X^{-1} Y = λ R_1 R_2^{-1} = J is upper triangular. So

    H̃ = Y^{-1} A Y = (Y^{-1} X)(X^{-1} A X)(X^{-1} Y) = J^{-1} H J.

Theorem 7.5.3 Let A ∈ R^{n×n} and x, y ∈ R^n with K[x, A, n] and K[y, A^T, n] nonsingular. Then

(i) If B = K[y, A^T, n]^T K[x, A, n] = (y^T A^{i+j-2} x)_{i,j=1,...,n} has a decomposition B = L D L^T, where L is lower triangular with l_ii = 1 and D is diagonal (that is, all principal determinants of B are nonzero), and if X = K[x, A, n] L^{-T}, then T = X^{-1} A X is an unreducible tridiagonal matrix.
More informationEigenvalue Problems CHAPTER 1 : PRELIMINARIES
Eigenvalue Problems CHAPTER 1 : PRELIMINARIES Heinrich Voss voss@tu-harburg.de Hamburg University of Technology Institute of Mathematics TUHH Heinrich Voss Preliminaries Eigenvalue problems 2012 1 / 14
More informationIterative methods for Linear System of Equations. Joint Advanced Student School (JASS-2009)
Iterative methods for Linear System of Equations Joint Advanced Student School (JASS-2009) Course #2: Numerical Simulation - from Models to Software Introduction In numerical simulation, Partial Differential
More informationApplied Mathematics 205. Unit II: Numerical Linear Algebra. Lecturer: Dr. David Knezevic
Applied Mathematics 205 Unit II: Numerical Linear Algebra Lecturer: Dr. David Knezevic Unit II: Numerical Linear Algebra Chapter II.3: QR Factorization, SVD 2 / 66 QR Factorization 3 / 66 QR Factorization
More informationWe will discuss matrix diagonalization algorithms in Numerical Recipes in the context of the eigenvalue problem in quantum mechanics, m A n = λ m
Eigensystems We will discuss matrix diagonalization algorithms in umerical Recipes in the context of the eigenvalue problem in quantum mechanics, A n = λ n n, (1) where A is a real, symmetric Hamiltonian
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 16: Reduction to Hessenberg and Tridiagonal Forms; Rayleigh Quotient Iteration Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical
More informationComputational Methods. Eigenvalues and Singular Values
Computational Methods Eigenvalues and Singular Values Manfred Huber 2010 1 Eigenvalues and Singular Values Eigenvalues and singular values describe important aspects of transformations and of data relations
More informationIntroduction to Iterative Solvers of Linear Systems
Introduction to Iterative Solvers of Linear Systems SFB Training Event January 2012 Prof. Dr. Andreas Frommer Typeset by Lukas Krämer, Simon-Wolfgang Mages and Rudolf Rödl 1 Classes of Matrices and their
More informationA HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY
A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY RONALD B. MORGAN AND MIN ZENG Abstract. A restarted Arnoldi algorithm is given that computes eigenvalues
More informationTopics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems
Topics The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems What about non-spd systems? Methods requiring small history Methods requiring large history Summary of solvers 1 / 52 Conjugate
More informationApproximating the matrix exponential of an advection-diffusion operator using the incomplete orthogonalization method
Approximating the matrix exponential of an advection-diffusion operator using the incomplete orthogonalization method Antti Koskela KTH Royal Institute of Technology, Lindstedtvägen 25, 10044 Stockholm,
More informationOn the influence of eigenvalues on Bi-CG residual norms
On the influence of eigenvalues on Bi-CG residual norms Jurjen Duintjer Tebbens Institute of Computer Science Academy of Sciences of the Czech Republic duintjertebbens@cs.cas.cz Gérard Meurant 30, rue
More informationM.A. Botchev. September 5, 2014
Rome-Moscow school of Matrix Methods and Applied Linear Algebra 2014 A short introduction to Krylov subspaces for linear systems, matrix functions and inexact Newton methods. Plan and exercises. M.A. Botchev
More informationNumerical Solution of Linear Eigenvalue Problems
Numerical Solution of Linear Eigenvalue Problems Jessica Bosch and Chen Greif Abstract We review numerical methods for computing eigenvalues of matrices We start by considering the computation of the dominant
More informationMATH36001 Generalized Inverses and the SVD 2015
MATH36001 Generalized Inverses and the SVD 201 1 Generalized Inverses of Matrices A matrix has an inverse only if it is square and nonsingular. However there are theoretical and practical applications
More informationKrylov Space Methods. Nonstationary sounds good. Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17
Krylov Space Methods Nonstationary sounds good Radu Trîmbiţaş Babeş-Bolyai University Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17 Introduction These methods are used both to solve
More informationA short course on: Preconditioned Krylov subspace methods. Yousef Saad University of Minnesota Dept. of Computer Science and Engineering
A short course on: Preconditioned Krylov subspace methods Yousef Saad University of Minnesota Dept. of Computer Science and Engineering Universite du Littoral, Jan 19-3, 25 Outline Part 1 Introd., discretization
More informationMatrix Factorization and Analysis
Chapter 7 Matrix Factorization and Analysis Matrix factorizations are an important part of the practice and analysis of signal processing. They are at the heart of many signal-processing algorithms. Their
More informationComputing Eigenvalues and/or Eigenvectors;Part 2, The Power method and QR-algorithm
Computing Eigenvalues and/or Eigenvectors;Part 2, The Power method and QR-algorithm Tom Lyche Centre of Mathematics for Applications, Department of Informatics, University of Oslo November 19, 2010 Today
More informationThis can be accomplished by left matrix multiplication as follows: I
1 Numerical Linear Algebra 11 The LU Factorization Recall from linear algebra that Gaussian elimination is a method for solving linear systems of the form Ax = b, where A R m n and bran(a) In this method
More informationCourse Notes: Week 1
Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues
More informationKrylov Subspaces. Lab 1. The Arnoldi Iteration
Lab 1 Krylov Subspaces Lab Objective: Discuss simple Krylov Subspace Methods for finding eigenvalues and show some interesting applications. One of the biggest difficulties in computational linear algebra
More informationEigenvalues and eigenvectors
Chapter 6 Eigenvalues and eigenvectors An eigenvalue of a square matrix represents the linear operator as a scaling of the associated eigenvector, and the action of certain matrices on general vectors
More informationProperties of Matrices and Operations on Matrices
Properties of Matrices and Operations on Matrices A common data structure for statistical analysis is a rectangular array or matris. Rows represent individual observational units, or just observations,
More informationQR-decomposition. The QR-decomposition of an n k matrix A, k n, is an n n unitary matrix Q and an n k upper triangular matrix R for which A = QR
QR-decomposition The QR-decomposition of an n k matrix A, k n, is an n n unitary matrix Q and an n k upper triangular matrix R for which In Matlab A = QR [Q,R]=qr(A); Note. The QR-decomposition is unique
More informationG1110 & 852G1 Numerical Linear Algebra
The University of Sussex Department of Mathematics G & 85G Numerical Linear Algebra Lecture Notes Autumn Term Kerstin Hesse (w aw S w a w w (w aw H(wa = (w aw + w Figure : Geometric explanation of the
More informationAPPLIED NUMERICAL LINEAR ALGEBRA
APPLIED NUMERICAL LINEAR ALGEBRA James W. Demmel University of California Berkeley, California Society for Industrial and Applied Mathematics Philadelphia Contents Preface 1 Introduction 1 1.1 Basic Notation
More informationSparse matrix methods in quantum chemistry Post-doctorale cursus quantumchemie Han-sur-Lesse, Belgium
Sparse matrix methods in quantum chemistry Post-doctorale cursus quantumchemie Han-sur-Lesse, Belgium Dr. ir. Gerrit C. Groenenboom 14 june - 18 june, 1993 Last revised: May 11, 2005 Contents 1 Introduction
More informationLecture 4 Orthonormal vectors and QR factorization
Orthonormal vectors and QR factorization 4 1 Lecture 4 Orthonormal vectors and QR factorization EE263 Autumn 2004 orthonormal vectors Gram-Schmidt procedure, QR factorization orthogonal decomposition induced
More informationNumerical Methods I Eigenvalue Problems
Numerical Methods I Eigenvalue Problems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 October 2nd, 2014 A. Donev (Courant Institute) Lecture
More informationLeast-Squares Systems and The QR factorization
Least-Squares Systems and The QR factorization Orthogonality Least-squares systems. The Gram-Schmidt and Modified Gram-Schmidt processes. The Householder QR and the Givens QR. Orthogonality The Gram-Schmidt
More informationThe Singular Value Decomposition and Least Squares Problems
The Singular Value Decomposition and Least Squares Problems Tom Lyche Centre of Mathematics for Applications, Department of Informatics, University of Oslo September 27, 2009 Applications of SVD solving
More informationMAT 610: Numerical Linear Algebra. James V. Lambers
MAT 610: Numerical Linear Algebra James V Lambers January 16, 2017 2 Contents 1 Matrix Multiplication Problems 7 11 Introduction 7 111 Systems of Linear Equations 7 112 The Eigenvalue Problem 8 12 Basic
More informationbe a Householder matrix. Then prove the followings H = I 2 uut Hu = (I 2 uu u T u )u = u 2 uut u
MATH 434/534 Theoretical Assignment 7 Solution Chapter 7 (71) Let H = I 2uuT Hu = u (ii) Hv = v if = 0 be a Householder matrix Then prove the followings H = I 2 uut Hu = (I 2 uu )u = u 2 uut u = u 2u =
More informationThe QR Factorization
The QR Factorization How to Make Matrices Nicer Radu Trîmbiţaş Babeş-Bolyai University March 11, 2009 Radu Trîmbiţaş ( Babeş-Bolyai University) The QR Factorization March 11, 2009 1 / 25 Projectors A projector
More information5.3 The Power Method Approximation of the Eigenvalue of Largest Module
192 5 Approximation of Eigenvalues and Eigenvectors 5.3 The Power Method The power method is very good at approximating the extremal eigenvalues of the matrix, that is, the eigenvalues having largest and
More informationLecture 9: Krylov Subspace Methods. 2 Derivation of the Conjugate Gradient Algorithm
CS 622 Data-Sparse Matrix Computations September 19, 217 Lecture 9: Krylov Subspace Methods Lecturer: Anil Damle Scribes: David Eriksson, Marc Aurele Gilles, Ariah Klages-Mundt, Sophia Novitzky 1 Introduction
More informationKey words. conjugate gradients, normwise backward error, incremental norm estimation.
Proceedings of ALGORITMY 2016 pp. 323 332 ON ERROR ESTIMATION IN THE CONJUGATE GRADIENT METHOD: NORMWISE BACKWARD ERROR PETR TICHÝ Abstract. Using an idea of Duff and Vömel [BIT, 42 (2002), pp. 300 322
More informationON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH
ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH V. FABER, J. LIESEN, AND P. TICHÝ Abstract. Numerous algorithms in numerical linear algebra are based on the reduction of a given matrix
More informationPreconditioned inverse iteration and shift-invert Arnoldi method
Preconditioned inverse iteration and shift-invert Arnoldi method Melina Freitag Department of Mathematical Sciences University of Bath CSC Seminar Max-Planck-Institute for Dynamics of Complex Technical
More informationFEM and sparse linear system solving
FEM & sparse linear system solving, Lecture 9, Nov 19, 2017 1/36 Lecture 9, Nov 17, 2017: Krylov space methods http://people.inf.ethz.ch/arbenz/fem17 Peter Arbenz Computer Science Department, ETH Zürich
More informationSTAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 9
STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 9 1. qr and complete orthogonal factorization poor man s svd can solve many problems on the svd list using either of these factorizations but they
More informationEECS 275 Matrix Computation
EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 17 1 / 26 Overview
More informationIntroduction to Numerical Linear Algebra II
Introduction to Numerical Linear Algebra II Petros Drineas These slides were prepared by Ilse Ipsen for the 2015 Gene Golub SIAM Summer School on RandNLA 1 / 49 Overview We will cover this material in
More informationArnoldi Methods in SLEPc
Scalable Library for Eigenvalue Problem Computations SLEPc Technical Report STR-4 Available at http://slepc.upv.es Arnoldi Methods in SLEPc V. Hernández J. E. Román A. Tomás V. Vidal Last update: October,
More information