Weighted Golub-Kahan-Lanczos algorithms and applications


Hongguo Xu and Hong-xiu Zhong

January 13, 2016

Affiliations: Hongguo Xu, Department of Mathematics, University of Kansas, Lawrence, KS, 66046, USA (xu@math.ku.edu). Hong-xiu Zhong, Department of Mathematics, East China Normal University, Shanghai, PR China (zhonghongxiu@gmail.com, zhonghongxiu@126.com); supported by the National Science Foundation of China and by the China Scholarship Council (CSC) during a visit at the University of Kansas.

Abstract. We present weighted Golub-Kahan-Lanczos algorithms for computing a bidiagonal form of a product of two invertible matrices. We give a convergence analysis for the algorithms when they are used for computing extreme eigenvalues of a product of two symmetric positive definite matrices. We apply the algorithms to the computation of extreme eigenvalues of a double-sized matrix arising from the linear response problem and provide convergence results. Numerical testing results are also reported. As another application we show a connection between the proposed algorithms and the conjugate gradient (CG) algorithm for positive definite linear systems.

Keywords: Weighted Golub-Kahan-Lanczos algorithm, eigenvalue, eigenvector, Ritz value, Ritz vector, linear response eigenvalue problem, Krylov subspace, bidiagonal matrices.

AMS subject classifications: 65F15, 15A18.

1 Introduction

The Golub-Kahan bidiagonalization procedure is a fundamental tool in the singular value decomposition (SVD) method [8]. Based on such a procedure, Golub-Kahan-Lanczos algorithms were developed in [15] that provide powerful Krylov subspace methods for solving large scale singular value and related eigenvalue problems, as well as least squares and saddle-point problems [15, 16]. Later, weighted Golub-Kahan-Lanczos algorithms were introduced for solving generalized least squares and saddle-point problems, e.g., [1, 4].

In this paper we propose a type of weighted Golub-Kahan-Lanczos bidiagonalization algorithms. The proposed algorithms are similar to the generalized Golub-Kahan-Lanczos bidiagonalization algorithms given in [1]. There, the algorithms are based on the Golub-Kahan bidiagonalizations of matrices of the form $R^{-1}AL^{-T}$, whereas ours are based on the bidiagonalizations of $R^TL$, where $R$, $L$ are invertible matrices. When $R$ and $L$ are factors of real symmetric positive definite matrices, $K = LL^T$ and $M = RR^T$, similar to those in [1], the proposed algorithms use the matrices $K$ and $M$ directly without computing the factors $L$ and $R$. Therefore, the algorithms are advantageous when $K$ and $M$ are large and sparse while $L$ and $R$ may be not. We demonstrate that the algorithms can be simply employed for solving large scale eigenvalue problems of matrices in the product form $MK$.

As an application, we also use the proposed algorithms for solving the eigenvalue problem of matrices of the form
$$H = \begin{bmatrix} 0 & M \\ K & 0 \end{bmatrix}$$
with symmetric positive definite $K$ and $M$. The eigenvalue problem of such a matrix arises from the linear response problem in time-dependent density functional theory [5, 11, 14, 19]. Related eigenvalue properties and numerical eigenvalue methods can be found in [2, 3, 10, 12, 13, 20, 23, 24, 25].

For a positive definite linear system it is well known that the conjugate gradient (CG) method is equivalent to the standard Lanczos method, e.g., [21, Sec. 6.7] and [9]. As another application, we demonstrate that in the case when $K$ or $M$ is the identity, the Krylov subspace linear system solver based on one of the proposed algorithms has a simpler and more direct connection to the CG method. In its original version (when neither $K$ nor $M$ is the identity), the solver is mathematically equivalent to a weighted CG method.

The paper is organized as follows. In Section 2 we present some basic results that are needed for deriving our algorithms and main results. In Section 3, we construct the weighted Golub-Kahan-Lanczos algorithms and show how they can be employed to solve the eigenvalue problem of a matrix that is a product of two symmetric positive definite matrices. Application of the weighted Golub-Kahan-Lanczos algorithms to the eigenvalue problem of the matrix $H$ from the linear response problem is described in Section 4, and numerical examples are given in Section 5. In Section 6, the relation between a weighted Golub-Kahan-Lanczos algorithm and a weighted CG method is provided. Section 7 contains some concluding remarks.

Throughout the paper, $\mathbb{R}$ is the real field, $\mathbb{R}^{m\times n}$ is the set of $m\times n$ real matrices, $\mathbb{R}^n$ is the $n$-dimensional real vector space, $I_n$ is the $n\times n$ identity matrix, and $e_j$ is its $j$th column. The notation $A > 0$ ($\ge 0$) means that the matrix $A$ is symmetric and positive definite (semidefinite). For a given matrix $A\in\mathbb{R}^{n\times n}$ and a vector $b\in\mathbb{R}^n$, the $k$th Krylov subspace of $A$ with $b$, i.e., the subspace spanned by the set of vectors $\{b, Ab, \ldots, A^{k-1}b\}$, is denoted by $\mathcal{K}_k(A, b)$. $\|\cdot\|$ is the spectral norm for matrices and the 2-norm for vectors. For a symmetric positive definite matrix $A$, $\|\cdot\|_A$ denotes the vector norm defined by $\|x\|_A = \sqrt{x^TAx}$ for any vector $x$, and a matrix $X$ is $A$-orthonormal if $X^TAX = I$ (it is $A$-orthogonal if $X$ is in addition square). A set of vectors $\{x_1,\ldots,x_k\}$ is also called $A$-orthonormal if $X = [x_1\ \cdots\ x_k]$ is $A$-orthonormal, and $A$-orthogonal if $x_i^TAx_j = 0$ for $i\ne j$. $\sigma_{\max}(A)$ and $\sigma_{\min}(A)$ are the largest and the smallest singular values of a matrix $A$, respectively, and $\kappa_2(A) = \sigma_{\max}(A)/\sigma_{\min}(A)$ (when $\sigma_{\min}(A) > 0$) is the spectral condition number of $A$. In the paper we restrict ourselves to the real case. All the results can be simply extended to the complex case.

2 Preliminaries

Suppose $K, M \in \mathbb{R}^{n\times n}$ are both symmetric and positive definite. There exist invertible matrices $L$ and $R$ such that
$$K = LL^T, \qquad M = RR^T. \qquad (1)$$
Consider the eigenvalue problem of the matrix products $MK$ or $KM = (MK)^T$. Since
$$MK = RR^TLL^T = L^{-T}(R^TL)^T(R^TL)L^T = R(R^TL)(R^TL)^TR^{-1},$$
$MK$ is similar to the symmetric positive definite matrix $(R^TL)^T(R^TL)$ (and $(R^TL)(R^TL)^T$).

Let
$$R^TL = \Phi\Lambda\Psi^T \qquad (2)$$
be a singular value decomposition (SVD), where $\Phi$ and $\Psi$ are real orthogonal and $\Lambda = \mathrm{diag}(\lambda_1,\ldots,\lambda_n)$ with $\lambda_1 \ge \cdots \ge \lambda_n > 0$ (since $R^TL$ is invertible). Then
$$MK = (L^{-T}\Psi)\Lambda^2(L^{-T}\Psi)^{-1} = (R\Phi)\Lambda^2(R\Phi)^{-1}. \qquad (3)$$
Therefore, we have the following basic properties.

Lemma 1. Let $R^TL$ have an SVD (2). Then
(a) The eigenvalues of $MK$ (and $KM$) are $\lambda_1^2,\ldots,\lambda_n^2$.
(b) The columns of $L^{-T}\Psi$ and $R\Phi = L^{-T}\Psi\Lambda$ are corresponding right (left) eigenvectors of $MK$ ($KM$), and $L^{-T}\Psi$ is $K$-orthogonal and $R\Phi$ is $M^{-1}$-orthogonal.
(c) The columns of $L\Psi = R^{-T}\Phi\Lambda$ and $R^{-T}\Phi$ are corresponding left (right) eigenvectors of $MK$ ($KM$), and $L\Psi$ is $K^{-1}$-orthogonal and $R^{-T}\Phi$ is $M$-orthogonal.

Proof. (a) follows directly from (3) and the fact $KM = (MK)^T$. (b) also follows from (3) and the fact that $(L^{-T}\Psi)^TK(L^{-T}\Psi) = I = (R\Phi)^TM^{-1}(R\Phi)$. The relation $R\Phi = L^{-T}\Psi\Lambda$ follows from (2). (c) can be proved in the same way as (b) by considering
$$KM = (L\Psi)\Lambda^2(L\Psi)^{-1} = (R^{-T}\Phi)\Lambda^2(R^{-T}\Phi)^{-1},$$
which is obtained by taking the transpose of (3).

Lemma 1 shows that the matrix $MK$ has a complete set of $K$-orthonormal right eigenvectors and a complete set of $M^{-1}$-orthonormal right eigenvectors. It also has a complete set of $K^{-1}$-orthonormal left eigenvectors and a complete set of $M$-orthonormal left eigenvectors. These four sets of eigenvectors are related, and the relations can be established without using the orthogonal matrices $\Phi$ and $\Psi$, as shown in the following lemma.

Lemma 2. Suppose $W = [\xi_1\ \xi_2\ \cdots\ \xi_n]$ and $\xi_1,\ldots,\xi_n$ are $K$-orthonormal right eigenvectors of $MK$ corresponding to the eigenvalues $\lambda_1^2,\ldots,\lambda_n^2$, so that $MKW = W\Lambda^2$. Then
(a) the columns of $W\Lambda = [\lambda_1\xi_1\ \lambda_2\xi_2\ \cdots\ \lambda_n\xi_n]$ form a complete set of $M^{-1}$-orthonormal right eigenvectors of $MK$;
(b) the columns of $KW = [K\xi_1\ K\xi_2\ \cdots\ K\xi_n] = M^{-1}W\Lambda^2$ form a complete set of $K^{-1}$-orthonormal left eigenvectors of $MK$;
(c) the columns of $KW\Lambda^{-1} = [\lambda_1^{-1}K\xi_1\ \lambda_2^{-1}K\xi_2\ \cdots\ \lambda_n^{-1}K\xi_n] = M^{-1}W\Lambda$ form a complete set of $M$-orthonormal left eigenvectors of $MK$.

Proof. The claim in (a) follows from the equivalence $MKW = W\Lambda^2$ if and only if $MK(W\Lambda) = (W\Lambda)\Lambda^2$, and the fact that $KW = M^{-1}W\Lambda^2$ implies $I = W^TKW = W^TM^{-1}W\Lambda^2$, which is equivalent to $(W\Lambda)^TM^{-1}(W\Lambda) = I$. The claim in (b) follows from the equivalence $MKW = W\Lambda^2$ if and only if $KM(KW) = (KW)\Lambda^2$, and the fact $(KW)^TK^{-1}(KW) = W^TKW = I$. The claim in (c) follows from the relations $KM(KW\Lambda^{-1}) = (KW\Lambda^{-1})\Lambda^2$ and $(KW\Lambda^{-1})^TM(KW\Lambda^{-1}) = \Lambda^{-1}W^TK(MKW)\Lambda^{-1} = \Lambda^{-1}W^TKW\Lambda^2\Lambda^{-1} = I$.
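For readers who want to see Lemmas 1 and 2 in action, the following small NumPy sketch builds random positive definite test matrices (all data and names here are illustrative, not from the paper) and checks the stated eigenvalue and eigenvector relations numerically.

```python
import numpy as np

# Numerical check of Lemmas 1 and 2 with small random SPD matrices (illustrative data).
rng = np.random.default_rng(0)
n = 6
L = rng.standard_normal((n, n)) + n * np.eye(n)      # invertible factor of K
R = rng.standard_normal((n, n)) + n * np.eye(n)      # invertible factor of M
K, M = L @ L.T, R @ R.T                              # K = L L^T > 0, M = R R^T > 0

Phi, lam, PsiT = np.linalg.svd(R.T @ L)              # R^T L = Phi Lambda Psi^T, eq. (2)
W = np.linalg.solve(L.T, PsiT.T)                     # W = L^{-T} Psi: K-orthonormal right eigenvectors

print(np.allclose(np.sort(np.linalg.eigvals(M @ K).real), np.sort(lam**2)))  # Lemma 1 (a)
print(np.allclose(W.T @ K @ W, np.eye(n)))                                   # W is K-orthonormal
print(np.allclose(M @ K @ W, W @ np.diag(lam**2)))                           # M K W = W Lambda^2
Z = (K @ W) / lam                                    # Z = K W Lambda^{-1}, Lemma 2 (c)
print(np.allclose(Z.T @ M @ Z, np.eye(n)))           # Z is M-orthonormal ...
print(np.allclose(K @ M @ Z, Z @ np.diag(lam**2)))   # ... and its columns are left eigenvectors of M K
```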

Therefore, once one of the four complete sets is known, the other three sets can be simply constructed. From Lemma 1, one can choose $W = L^{-T}\Psi$ when $\Psi$ is known. Then it is easily verified that
$$W = L^{-T}\Psi, \qquad KW = L\Psi, \qquad W\Lambda = R\Phi, \qquad KW\Lambda^{-1} = R^{-T}\Phi. \qquad (4)$$

3 Weighted Golub-Kahan-Lanczos algorithms

In principle, the eigenvalue problem of $MK$ or $KM$ can be solved as follows: we first compute the factorizations in (1), then compute an SVD of $R^TL$ as in (2). Computing an SVD of $R^TL$ requires an initial step for determining real orthogonal matrices $U$ and $V$ such that
$$U^TR^TLV = B, \qquad (5)$$
where $B$ is upper or lower bidiagonal [8]. An SVD $B = \tilde\Phi\Lambda\tilde\Psi^T$ can then be computed by running SVD iterations, and one has (2) with $\Phi = U\tilde\Phi$ and $\Psi = V\tilde\Psi$. For large scale problems, the columns of $U$, $V$ and the entries of $B$ can be computed by the Golub-Kahan-Lanczos algorithm proposed in [15], and the singular values of the leading principal submatrices of $B$ are used for approximating the singular values of $R^TL$, and therefore also the eigenvalues of $MK$ and $KM$. This algorithm requires the matrices $L$ and $R$. If the matrices $K$ and $M$ are given, computing $L$ and $R$ is time consuming, and normally $L$ and $R$ are not as sparse as $K$ and $M$. This causes a numerical difficulty for using Lanczos-type methods.

We propose weighted Golub-Kahan-Lanczos algorithms to overcome this difficulty. The algorithms apply to the matrices $K$ and $M$ directly instead of their factors. They are constructed based on the following observations. Define
$$X = R^{-T}U, \qquad Y = L^{-T}V, \qquad (6)$$
where $U$ and $V$ are given in (5). It is easily verified that
$$X^TMX = U^TR^{-1}MR^{-T}U = U^TU = I, \qquad Y^TKY = V^TL^{-1}KL^{-T}V = V^TV = I,$$
i.e., $X$ is $M$-orthonormal and $Y$ is $K$-orthonormal. The bidiagonal form (5) and its transpose can be written as
$$LV = R^{-T}UB, \qquad RU = L^{-T}VB^T.$$
Since $LV = LL^T(L^{-T}V) = KY$ and $RU = RR^T(R^{-T}U) = MX$, we have
$$KY = XB, \qquad MX = YB^T. \qquad (7)$$
Based on these two relations and the orthogonality of $X$ and $Y$, Lanczos-type iteration procedures can be easily constructed. There are two Lanczos-type procedures, depending on whether $B$ is upper or lower bidiagonal. We first consider the upper bidiagonal case, and we call the procedure the upper bidiagonal version of the weighted Golub-Kahan-Lanczos algorithm (WGKLU).

Denote $X = [x_1\ \cdots\ x_n]$, $Y = [y_1\ \cdots\ y_n]$, and
$$B = \begin{bmatrix} \alpha_1 & \beta_1 & & \\ & \alpha_2 & \ddots & \\ & & \ddots & \beta_{n-1} \\ & & & \alpha_n \end{bmatrix}.$$
By comparing the columns of the relations in (7), one has
$$Ky_1 = \alpha_1x_1, \qquad\qquad\qquad\quad Mx_1 = \alpha_1y_1 + \beta_1y_2,$$
$$Ky_2 = \beta_1x_1 + \alpha_2x_2, \qquad\quad Mx_2 = \alpha_2y_2 + \beta_2y_3,$$
$$\vdots$$
$$Ky_k = \beta_{k-1}x_{k-1} + \alpha_kx_k, \quad\ Mx_k = \alpha_ky_k + \beta_ky_{k+1},$$
$$\vdots$$
$$Ky_n = \beta_{n-1}x_{n-1} + \alpha_nx_n, \quad Mx_n = \alpha_ny_n.$$
Choosing an initial vector $y_1$ satisfying $y_1^TKy_1 = 1$, and using the orthogonality $x_i^TMx_j = y_i^TKy_j = \delta_{ij}$, where $\delta_{ij}$ is 0 if $i\ne j$ and 1 if $i = j$, the columns of $X$ and $Y$, as well as the entries of $B$, can be computed by the iterations
$$\alpha_j = \|Ky_j - \beta_{j-1}x_{j-1}\|_M, \qquad x_j = (Ky_j - \beta_{j-1}x_{j-1})/\alpha_j,$$
$$\beta_j = \|Mx_j - \alpha_jy_j\|_K, \qquad\qquad y_{j+1} = (Mx_j - \alpha_jy_j)/\beta_j,$$
with $x_0 = 0$ and $\beta_0 = 1$, for $j = 1, 2, \ldots$

The following observations help to make the iterations more efficient. Computing $\alpha_j$ requires the vector $f_j := M(Ky_j - \beta_{j-1}x_{j-1})$, which equals $\alpha_jMx_j$. The vector $Mx_j$ appears in $Mx_j - \alpha_jy_j$, needed for computing $\beta_j$ and $y_{j+1}$, and it can now be computed as $f_j/\alpha_j$. In this way, we save one matrix-vector multiplication. Similarly, computing $\beta_j$ needs the vector $g_{j+1} := K(Mx_j - \alpha_jy_j) = \beta_jKy_{j+1}$. The vector $Ky_{j+1}$, involved in computing $\alpha_{j+1}$ and $x_{j+1}$ in the next iteration, can then be computed as $g_{j+1}/\beta_j$, so another matrix-vector multiplication can be saved. We now have the following WGKLU algorithm.

Algorithm 1 (WGKLU)
Choose $y_1$ satisfying $\|y_1\|_K = 1$, and set $\beta_0 = 1$, $x_0 = 0$. Compute $g_1 = Ky_1$.
For $j = 1, 2, \ldots$
  $s_j = g_j/\beta_{j-1} - \beta_{j-1}x_{j-1}$
  $f_j = Ms_j$
  $\alpha_j = (s_j^Tf_j)^{1/2}$
  $x_j = s_j/\alpha_j$
  $t_{j+1} = f_j/\alpha_j - \alpha_jy_j$
  $g_{j+1} = Kt_{j+1}$
  $\beta_j = (t_{j+1}^Tg_{j+1})^{1/2}$
  $y_{j+1} = t_{j+1}/\beta_j$
End
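As a concrete illustration, here is a minimal NumPy sketch of Algorithm 1 for dense matrices; the function name wgklu and the test data are ours, not from the paper. The short check at the end verifies the M- and K-orthonormality of the computed vectors and the bidiagonal relation $KY_k = X_kB_k$ that is established below.

```python
import numpy as np

def wgklu(K, M, y1, k):
    """Sketch of Algorithm 1 (WGKLU): k steps with dense SPD K and M.
    Returns X_k (n x k), Y_{k+1} (n x (k+1)) and the entries alpha_j, beta_j of B_k."""
    n = K.shape[0]
    X, Y = np.zeros((n, k)), np.zeros((n, k + 1))
    alpha, beta = np.zeros(k), np.zeros(k)
    y = y1 / np.sqrt(y1 @ (K @ y1))           # normalize so that ||y_1||_K = 1
    Y[:, 0] = y
    g = K @ y                                 # g_1 = K y_1
    beta_prev, x_prev = 1.0, np.zeros(n)
    for j in range(k):                        # loop index j corresponds to step j+1 in the paper
        s = g / beta_prev - beta_prev * x_prev
        f = M @ s
        alpha[j] = np.sqrt(s @ f)
        X[:, j] = x_prev = s / alpha[j]
        t = f / alpha[j] - alpha[j] * Y[:, j]
        g = K @ t                             # g_{j+1} = K t_{j+1}
        beta[j] = np.sqrt(t @ g)
        beta_prev = beta[j]
        Y[:, j + 1] = t / beta[j]
    return X, Y, alpha, beta

# Illustrative test data and a quick consistency check.
rng = np.random.default_rng(1)
n, k = 60, 8
A = rng.standard_normal((n, n)); K = A @ A.T + n * np.eye(n)
C = rng.standard_normal((n, n)); M = C @ C.T + n * np.eye(n)
X, Y, alpha, beta = wgklu(K, M, rng.standard_normal(n), k)
B = np.diag(alpha) + np.diag(beta[:-1], 1)        # upper bidiagonal B_k
print(np.allclose(X.T @ M @ X, np.eye(k)))        # X_k is M-orthonormal
print(np.allclose(Y.T @ K @ Y, np.eye(k + 1)))    # Y_{k+1} is K-orthonormal
print(np.allclose(K @ Y[:, :k], X @ B))           # K Y_k = X_k B_k
```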

In each iteration, this algorithm requires two matrix-vector multiplications, and it needs six vectors $f$, $x_{k-1}$, $x_k$, $g$, $y_k$, $y_{k+1}$ to be stored ($x_k$, $y_{k+1}$ may overwrite $s_k$ and $t_{k+1}$).

Suppose Algorithm 1 runs $k$ iterations, so that we have $x_1,\ldots,x_k$, $y_1,\ldots,y_{k+1}$ and $\alpha_j$, $\beta_j$ for $j = 1,\ldots,k$. For any $j \ge 1$ define
$$X_j = [x_1\ \cdots\ x_j], \qquad Y_j = [y_1\ \cdots\ y_j], \qquad B_j = \begin{bmatrix} \alpha_1 & \beta_1 & & \\ & \ddots & \ddots & \\ & & \ddots & \beta_{j-1} \\ & & & \alpha_j \end{bmatrix}.$$
Then we have the relations
$$KY_k = X_kB_k, \qquad MX_k = Y_kB_k^T + \beta_ky_{k+1}e_k^T = Y_{k+1}\begin{bmatrix} B_k^T \\ \beta_ke_k^T \end{bmatrix}, \qquad (8)$$
and
$$X_k^TMX_k = I = Y_k^TKY_k. \qquad (9)$$
When $k = n$, necessarily $\beta_n = 0$ and we recover (5).

Algorithm 1 can be used for computing the extreme eigenvalues of $MK$ and $KM$ directly. From (8), one has
$$MKY_k = Y_kB_k^TB_k + \alpha_k\beta_ky_{k+1}e_k^T, \qquad KMX_k = X_k(B_kB_k^T + \beta_k^2e_ke_k^T) + \alpha_{k+1}\beta_kx_{k+1}e_k^T. \qquad (10)$$
The matrices $B_kB_k^T + \beta_k^2e_ke_k^T$ and $B_k^TB_k$ are symmetric tridiagonal. So WGKLU is equivalent to two weighted versions of the Lanczos algorithm applied to the matrices $MK$ and $(MK)^T$, respectively. Clearly, $x_1$ is parallel to the vector $Ky_1$. From the relations in (10), one has
$$\mathrm{range}\,Y_k = \mathcal{K}_k(MK, y_1), \qquad \mathrm{range}\,X_k = \mathcal{K}_k(KM, Ky_1) = K\mathcal{K}_k(MK, y_1). \qquad (11)$$
Looking more closely, suppose $K$, $M$ are factorized as in (1), and let $U_k = R^TX_k$, $V_k = L^TY_k$. Then from (9), $U_k^TU_k = V_k^TV_k = I$, and the relations in (10) are equivalent to
$$L^TMLV_k = V_kB_k^TB_k + \alpha_k\beta_kv_{k+1}e_k^T, \qquad R^TKRU_k = U_k(B_kB_k^T + \beta_k^2e_ke_k^T) + \alpha_{k+1}\beta_ku_{k+1}e_k^T, \qquad (12)$$
which are identical to the relations obtained by applying the standard Lanczos algorithm to the symmetric positive definite matrices $L^TML$ and $R^TKR$, respectively. The relations also show that WGKLU breaks down only when $\beta_k = 0$ for some $k$. Otherwise, if $\alpha_k = 0$ while $\beta_{k-1}\ne 0$, one would have $L^TMLV_k = V_kB_k^TB_k$ with $B_k$ singular, which implies that $V_k^TL^TMLV_k$ is singular, contradicting the positive definiteness of $L^TML$.

Going back to (10), if $\beta_k = 0$, one has $MKY_k = Y_kB_k^TB_k$ and $KMX_k = X_kB_kB_k^T$. Then the columns of $Y_k$ and $X_k$, respectively, span invariant subspaces of $MK$ and $KM$, and the squares of the singular values of $B_k$ are eigenvalues of $MK$ and $KM$. If $\beta_k > 0$, we may still compute the singular values and singular vectors of $B_k$ and $[B_k\ \ \beta_ke_k]$, and use them to approximate the eigenvalues and eigenvectors of $MK$ and $KM$.

We first consider the approximations based on the first relation of (10). Suppose $B_k$ has an SVD
$$B_k = \Phi_k\Sigma_k\Psi_k^T, \qquad \Phi_k = [\mu_1\ \cdots\ \mu_k], \quad \Psi_k = [\nu_1\ \cdots\ \nu_k], \quad \Sigma_k = \mathrm{diag}(\sigma_1,\ldots,\sigma_k). \qquad (13)$$
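Before turning to the Ritz vectors, here is a quick numerical look, continuing the illustrative sketch given after Algorithm 1, at how the squares of the singular values of $B_k$ and $[B_k\ \ \beta_ke_k]$ track the extreme eigenvalues of $MK$ (the reference eigenvalues are computed via the similar symmetric matrix $G^TMG$, where $K = GG^T$ is a Cholesky factorization).

```python
import numpy as np

# Continuing the sketch after Algorithm 1 (B, beta, k, K, M are the illustrative data from there).
B_ext = np.hstack([B, beta[-1] * np.eye(k)[:, -1:]])   # [B_k, beta_k e_k]
sig = np.linalg.svd(B, compute_uv=False)
rho = np.linalg.svd(B_ext, compute_uv=False)

G = np.linalg.cholesky(K)                              # K = G G^T
lam2 = np.sort(np.linalg.eigvalsh(G.T @ M @ G))[::-1]  # eigenvalues lambda_j^2 of M K, descending
print(lam2[:2], rho[:2]**2, sig[:2]**2)                # approximations of the two largest eigenvalues
print(lam2[-2:], sig[-2:]**2)                          # approximations of the two smallest eigenvalues
```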

7 Then from the first relation in (10) for each j {1,, }, we may tae σj 2 as a Ritz value of MK and Y ν j as a corresponding right Ritz vector Since Y is K-orthonormal and Ψ is real orthogonal, Y ν 1,, Y ν are K-orthonormal Also, we have the residual (MK σ 2 j I)Y ν j = α β ν j y +1, where ν j is the th component of ν j, for j = 1,, In view of Lemma 2 (c) or by premultiplying K to the the first equation in (10), it is natural to tae KY ν j or equivalently, σj 1 KY ν j = σj 1 X B ν j = X µ j, j = 1,, (14) as the corresponding left Ritz vectors Note X µ 1,, X µ are M-orthonormal For each j {1,, }, the corresponding residual, following (10), is the transpose of (KM σ 2 j I)X µ j = β µ j (β x + α +1 x +1 ) = β µ j Ky +1, where µ j is the th component of µ j In practice, we may use the residual norms (MK σj 2 I)Y ν j K = α β ν j, (KM σj 2 I)X µ j M = β µ j β 2 + α2 +1 to design a stopping criterion for WGKLU The convergence properties can be readily established by employing the convergence theory of the standard Lanczos algorithm (9, 18, 22) We need the following definitions For two vectors 0 x, y R n and 0 < A R n n, we define the angles θ(x, y) = arccos xt y x y, θ A(x, y) = arccos xt Ay x A y A We also denote by C j (x) the degree j Chebyshev polynomial of the first ind The following convergence results are based on the theory given in 22 Theorem 3 Let λ 2 1 λ2 2 λ2 n > 0 be the eigenvalues of MK with λ j > 0 for j = 1,, n, and ξ 1,, ξ n be the corresponding K-orthonormal right eigenvectors, and following Lemma 2 (c) let η j := λ 1 j Kξ j,j = 1,, n be the corresponding M-orthonormal left eigenvectors Suppose that B has an SVD (13) with σ 1 σ 2 σ > 0 Consider the Ritz values σ1 2,, σ2, the corresponding K-orthonormal right Ritz vectors Y ν 1,, Y ν, and the M-orthonormal left Ritz vectors X µ 1,, X µ, associated with MK Let γ j = λ2 j λ2 j+1 λ 2 j+1, γ j = λ2 n +j 1 λ2 n +j λ2 n λ 2 1 ; 1 j, λ2 n +j 1 and y 1 be the initial vector in Algorithm 1 Then for j = 1,,, 0 λ 2 j σ 2 j (λ 2 1 λ 2 n) ( ) πj, tan θ K (y 1, ξ j ) 2 (15) C j (1 + 2γ j ) with π 1, = 1; π j, = j 1 i=1 σi 2 λ2 n σi 2, j > 1; λ2 j 7

8 and with ( ) 0 σj 2 λ 2 n +j (λ2 1 λ 2 πj, tan θ K (y 1, ξ n +j ) 2 n) (16) C j 1 (1 + 2 γ j ) n π, = 1; π j, = i=n +j+1 σi 2 λ2 1 σi 2, j < ; λ2 n +j The corresponding Ritz vectors have the following bounds: (σj ) 2 ( ) 2 sin 2 σj θ M (X µ j, η j ) + 1 λ j λ j with δ j = min i j λ 2 j σ2 i and = sin θ K (Y ν j, ξ j ) π j 1 + (α β ) 2 /δ 2 j C j (1 + 2γ j ) sin θ K (y 1, ξ j ) (17) π 1 = 1, π j = j 1 i=1 λ 2 i λ2 n λ 2 i, j > 1; λ2 j ( σj ) 2 ( σj sin θ M (X µ j, η n +j) + 1 ) 2 λ n +j with δ j = min i j λ 2 n +j σ2 i and = sin θ K (Y ν j, ξ n +j ) π j π = 1, π j = n i=n +j+1 λ n +j 1 + (α β ) 2 / δ 2 j C j 1 (1 + 2 γ j ) λ 2 i λ2 1 λ 2 i, j < λ2 n +j sin θ K (y 1, ξ n +j ) (18) Proof We first prove (15) and (17) From Lemma 1, the K-orthonormal right eigenvectors ξ 1,, ξ n are well defined For any j the vector L T ξ j is a unit eigenvector of L T ML corresponding to the eigenvalue λ 2 j, and LT ξ 1,, L T ξ n are orthonormal From the first equation of (12), σ 2 1,, σ2 are the Ritz values of LT ML and for V = L T Y, V ν 1,, V ν are the corresponding orthonormal right (left) Ritz vectors Applying the standard Lanczos convergence results in 22, Sec 66 to the first relation in (12), one has ( 0 λ 2 j σj 2 (λ 2 1 λ 2 πj, tan θ(l T y 1, L T ξ j ) n) C j (1 + 2γ j ) ) 2 sin θ(v ν j, L T ξ j ) π j 1 + (α β ) 2 /δ 2 j C j (1 + γ j ) sin θ(l T y 1, L T ξ j ), where π j,, π j, δ j, γ j are defined in the theorem The bounds (15) and (17) can be derived simply by using the equalities θ(l T y 1, L T ξ j ) = θ K (y 1, ξ j ), θ(v ν j, L T ξ j ) = θ(l T Y ν j, L T ξ j ) = θ K (Y ν j, ξ j ) 8

9 We still need to prove the equality in (17) By (14), Hence cos θ M (η j, X µ j ) = cos θ M (Kξ j /λ j, KY ν j /σ j ) = νt j Y T KMKξ j σ j λ j = λ j σ j ν T j Y T Kξ j = λ j σ j cos θ K (Y ν j, ξ j ) cos θ K (Y ν j, ξ j ) = σ j λ j cos θ M (X µ j, η j ), (19) and from which one has (σj ) 2 sin θ K (Y ν j, ξ j ) = sin 2 θ M (X µ j, η j ) + 1 λ j The bounds (16) and (18) can be proved by applying the proved results to the matrix MK The equality in (18) can be established by using the identity cos θ K (Y ν j, ξ n +j ) = σ j cos θ M (X µ j, η n +j ), (20) λ n +j which can be derived in the same way as for (19) The second relation in (10) can to used to approximate the eigenvalues and eigenvectors of MK too Let ρ B β e ω1 ω +1 = ζ1 ζ (21) 0 ρ 0 be an SVD Then ρ 2 1,, ρ2 are Ritz values and X ζ 1,, X ζ are corresponding M-orthonormal left (right) Ritz vectors of MK (KM) The residual (in transpose) formula is (KM ρ 2 ji)x ζ j = α +1 β ζ j x +1, where ζ j is the th component of ζ j Following Lemma 2 (c), or by pre-multiplying M to the second equation in (10) and using (8), the K-orthonormal vectors ( σj ρ 1 j MX ζ j = ρ 1 T j Y +1 B β e ζj = Y +1 ω j, j = 1,,, (22) can be served as the right Ritz vectors of MK From the first equation of (10) with replaced by + 1, MKY +1 = Y +1 B β e T B β e + α+1 (α +1 y +1 + β +1 y +2 )e T +1 = Y +1 B β e T B β e + α+1 Mx +1 e T +1 So for each j {1,, }, the residual is (MK ρ 2 ji)y +1 ω j = α +1 ω j,+1 Mx +1 = α +1 ω j,+1 (α +1 y +1 + β +1 y +2 ), where ω j,+1 is the ( + 1)st component of ω j Hence (KM ρ 2 ji)x ζ j M = α +1 β ζ j, (MK ρ 2 ji)y +1 ω j K = α +1 ω j,+1 α β2 +1 The same type of convergence theory can be established λ j ) 2 9

10 Theorem 4 Let λ 2 1 λ2 2 λ2 n > 0 be the eigenvalues of MK with λ j > 0 for j = 1,, n, η 1,, η n be the corresponding M-orthonormal left eigenvectors associated with MK, and by Lemma 2 (c), ξ j = λ 1 j Mη j, j = 1,, n be the corresponding K-orthonormal right eigenvectors Suppose ρ 1 ρ are the singular values of B β e, ζ1,, ζ the corresponding orthonormal left singular vectors, and ω 1,, ω the corresponding orthonormal right singular vectors, as defined in (21) Let γ j, γ j, π j, and π j be defined in Theorem 3, and x 1 = Ky 1 / Ky 1 M be generated by Algorithm 1 Then for j = 1,,, with and with 0 λ 2 j ρ 2 j (λ 2 1 λ 2 n) κ 1, = 1; κ j, = j 1 i=1 ( ) κj, tan θ M (x 1, η j ) 2 (23) C j (1 + 2γ j ) ρ 2 i λ2 n ρ 2 i, j > 1; λ2 j ( ) 0 ρ 2 j λ 2 n +j (λ2 1 λ 2 κj, tan θ M (x 1, η n +j ) 2 n) (24) C j 1 (1 + 2 γ j ) κ, = 1; κ j, = n i=n +j+1 ρ 2 i λ2 1 ρ 2 i, j < ; λ2 n +j The corresponding Ritz vectors of M K have the following bounds: ( ) 2 ( ) 2 ρj sin 2 ρj θ K (Y +1ω j, ξ j ) + 1 λ j λ j with ɛ j = min i j λ 2 j ρ2 i, = sin θ M (X ζ j, η j ) π j 1 + (α +1 β ) 2 /ɛ 2 j C j (1 + 2γ j ) sin θ M (x 1, η j ) (25) ( ρj λ n +j ) 2 ( ) 2 sin 2 ρj θ K (Y +1ω j, ξ n +j) + 1 λ n +j = sin θ M (X ζ j, η n +j ) with ɛ j = min i j λ 2 n +j ρ2 i π j 1 + (α +1 β ) 2 / ɛ 2 j C j 1 (1 + 2 γ j ) sin θ M (x 1, η n +j ) (26) Proof The bounds can be derived by applying the standard Lanczos convergence results to the second relation in (12) By (22), cos θ K (ξ j, Y +1 ω j ) = λ 1 j and with which the equality in (25) can be derived ρ 1 j ηj T MK(MX ζ j ) = λ j ηj T MX ζ j = λ j cos θ M (X ζ j, η j ), ρ j ρ j 10

11 Similarly, one has cos θ K (ξ n +j, Y +1 ω j ) = λ n +j ρ j cos θ M (X ζ j, η n +j ), which establishes the equality in (26) Remar 1 Since σ 1,, σ are the singular values of B and ρ 1,, ρ are the singular values of B β e, from the interlacing properties (9, Theorem 862) one has ρ 1 σ 1 ρ 2 σ 2 ρ σ From Theorems 3 and 4, for approximating large eigenvalues of MK it is better to use ρ 2 j while for small eigenvalues it is better to use σ 2 j For instance, if we need to approximate λ2 1, ρ 2 1 is more accurate than σ2 1, and if we need to approximate λ2 n, then σ 2 is better than ρ2 Remar 2 Theorems 3 and 4 provide the convergence results for both the left and right eigenvectors of MK as well as KM = (MK) T The quantities sin θ K (y 1, ξ j ) and sin θ M (x 1, η j ) represent the influence of the initial vector y 1 to the approximated eigenvalues and eigenvectors In general, the angles θ K (y 1, ξ j ) and θ M (x 1, η j ) are different, but they are close Recall, x 1 = Ky 1 / Ky 1 M, η j = λ 1 j Kξ j and MKξ j = λ 2 j ξ j So cos θ M (x 1, η j ) = x T 1 Mη j = yt 1 KMKξ j λ j Ky 1 M = Because λ j Ky 1 M y T 1 Kξ j = λ j Ky 1 M cos θ K (y 1, ξ j ) Ky 1 2 M = y T 1 KMKy 1 = (L T y 1 ) T (L T ML)(L T y 1 ), (L T y 1 ) T (L T y 1 ) = y T 1 Ky 1 = 1, and λ 2 1,, λ2 n are the eigenvalues of L T ML, one has λ n Ky 1 M λ 1 Therefore λ j λ 1 cos θ K (y 1, ξ j ) cos θ M (x 1, η j ) λ j λ n cos θ K (y 1, ξ j ) (27) When the matrix B in (7) is lower bidiagonal, a corresponding lower bidiagonal version of weighted Golub-Kahan-Lanczos algorithm (WGKLL) can be easily derived by interchanging the roles of K and M, X and Y in (7) In this case, in order to avoid confusion we use X, Ỹ, B instead of X, Y, B in (7) and we have KỸ = X B, M X = Ỹ B T with B = α 1 β 1 β n 1 α n (28) The K-orthogonal matrix Ỹ and the M-orthogonal matrix X and the entries of B can be computed by the following algorithm 11

12 Algorithm 2 (WGKLL) Choose x 1 satisfying x 1 M = 1, and set β 0 = 1, ỹ 0 = 0 Compute g 1 = M x 1 For j = 1, 2, s j = g j / β j 1 β j 1 ỹ j 1 f j = Ks j α j = (s T j f j) 1 2 ỹ j = s j / α j t j+1 = f j / α j α j x j g j+1 = Mt j+1 β j = (t T j+1 g j+1) 1 2 x j+1 = t j+1 / β j End Let X j = x 1 x j, Ỹj = ỹ 1 ỹ j, Bj = α 1 β 1 β j 1 α j Then one has KỸ = X B + β x +1 e T = X +1 B β e T, M X = Ỹ B T and Then, X T M X = I = Ỹ T KỸ KM X = X B BT + α β x +1 e T, MKỸ = Ỹ( B T B + β 2 e e T ) + α +1 β ỹ +1 e T, and range X = K (KM, x 1 ), range Ỹ = K (MK, M x 1 ) = MK (KM, x 1 ) If we use the singular values and vectors of B B and β e T to approximate the eigenvalues and eigenvectors of MK and KM, convergence theory as Theorems 3, 4 can be established in the same way In the rest of this section we discuss the relations between the two algorithms Denote U = R T X, V = L T Y, Ũ = RT X, Ṽ = L T Ỹ, all of which are orthogonal matrices, where Ũ and Ṽ satisfy Ũ T R T LṼ = B Note from (5), one has Hence R T L = UBV T = Ũ BṼ T Ũ T UB = BṼ T V If we choose y 1 and x 1 = x 1 = Ky 1 / Ky 1 M, then the first columns of Ũ and U are identical, or the first column of Ũ T U is e 1 Since Ũ T UBB T (Ũ T U) T = B B T is a tridiagonal reduction 12

13 of BB T, if all β j, α j, β j, α j are positive, by the implicit-q Theorem (9), Ũ T U = I, ie, U = Ũ, or equivalently X = X Then B = BQ with Q = V T Ṽ is an RQ factorization of the lower bidiagonal matrix B Hence, when WGKLU starts with y 1 and WGKLL starts with x 1 = Ky 1 / Ky 1 M, if both algorithms can run n iterations, the generated matrices satisfies X = X, Ỹ = Y Q Since BT B = Q T B T BQ, it is not difficult to see that Q is just the orthogonal matrix generated by running one QR iteration from B T B to B T B with zero shift (9) Clearly, one has BB T = B B T For any integer 1 n, by comparing the leading principal submatrices of BB T and B B T, one has B β e B β e T = B BT So the singular values of B and B β e are identical Now we have four blocs B β e T, B, B β e, B and the singular values of each can be used for eigenvalue approximations Following the B same arguments given in Remar 1, the squares of large singular values of β e T are closer to the large eigenvalues of MK than those of B So they are also closer than those of B β e and B Similarly, the squares of small singular values of B are the closest to the small eigenvalues of MK among those of the above four blocs Similarly, when WGKLL starts with x 1 and WGKLU starts with y 1 = ỹ 1 = M x 1 / M x 1 K, we have Ỹ = Y and X = X Q with orthogonal matrix Q satisfying B = Q B It can be interpreted that Q is obtained by performing one QR iteration on B B T with zero shift In this case, among the above four blocs B β e, will provide the best approximations to the large eigenvalues of MK and B will provide the best approximations to the small eigenvalues of M K 4 Application to linear response eigenvalue problem Consider the matrix H = 0 M K 0, 0 < K, M R n n (29) The eigenvalue problem of such a matrix arises from linear response problem (2, 3, 5, 11, 14, 19) In practice, the extreme eigenvalues of H, in particular a few small positive eigenvalues, are of interest From (7) and the fact that X is M-orthogonal and Y is K-orthogonal, for X = Y 0 0 X K 0, K = 0 M, 13

one has
$$H\mathcal{X} = \mathcal{X}\mathcal{B}, \qquad \mathcal{B} = \begin{bmatrix} 0 & B^T \\ B & 0 \end{bmatrix}; \qquad \mathcal{X}^T\mathcal{K}\mathcal{X} = I_{2n}.$$
So $H$ is similar to the symmetric matrix $\mathcal{B}$ with the $\mathcal{K}$-orthogonal transformation matrix $\mathcal{X}$. Moreover, suppose $B = \tilde\Phi\Lambda\tilde\Psi^T$ is an SVD of $B$. Define the symmetric orthogonal matrix
$$P_n = \frac{1}{\sqrt2}\begin{bmatrix} I_n & I_n \\ I_n & -I_n \end{bmatrix}.$$
Then
$$H\begin{bmatrix} Y\tilde\Psi & 0 \\ 0 & X\tilde\Phi \end{bmatrix}P_n = \begin{bmatrix} Y\tilde\Psi & 0 \\ 0 & X\tilde\Phi \end{bmatrix}P_n\begin{bmatrix} \Lambda & 0 \\ 0 & -\Lambda \end{bmatrix}.$$
Hence, $\pm\lambda_1,\ldots,\pm\lambda_n$ are the eigenvalues of $H$. From (5) and (2), one has
$$\Phi = U\tilde\Phi, \qquad \Psi = V\tilde\Psi. \qquad (30)$$
Following (4) and (6), one has
$$Y\tilde\Psi = L^{-T}\Psi = W =: [\xi_1\ \cdots\ \xi_n], \qquad X\tilde\Phi = R^{-T}\Phi = KW\Lambda^{-1} =: [\eta_1\ \cdots\ \eta_n].$$
By Lemma 2, $\xi_1,\ldots,\xi_n$ form a complete set of $K$-orthonormal right eigenvectors and $\eta_1,\ldots,\eta_n$ form a complete set of $M$-orthonormal left eigenvectors of $MK$ corresponding to the eigenvalues $\lambda_1^2,\ldots,\lambda_n^2$, as used in both Theorems 3 and 4. Hence
$$\begin{bmatrix} Y\tilde\Psi & 0 \\ 0 & X\tilde\Phi \end{bmatrix}P_n = \frac{1}{\sqrt2}\begin{bmatrix} \xi_1 & \cdots & \xi_n & \xi_1 & \cdots & \xi_n \\ \eta_1 & \cdots & \eta_n & -\eta_1 & \cdots & -\eta_n \end{bmatrix} =: [x_1^+\ \cdots\ x_n^+\ x_1^-\ \cdots\ x_n^-],$$
and $x_j^\pm$, $j = 1,\ldots,n$, form a complete set of $\mathcal{K}$-orthonormal right eigenvectors of $H$ corresponding to the eigenvalues $\pm\lambda_j$. Since
$$H^T = \begin{bmatrix} 0 & I_n \\ I_n & 0 \end{bmatrix}H\begin{bmatrix} 0 & I_n \\ I_n & 0 \end{bmatrix}, \qquad (31)$$
one can show that $y_j^\pm := \frac{1}{\sqrt2}[\pm\eta_j^T\ \ \xi_j^T]^T$, $j = 1,\ldots,n$, form a complete set of $\mathcal{M}$-orthonormal left eigenvectors of $H$ corresponding to the eigenvalues $\pm\lambda_j$, where $\mathcal{M} = \begin{bmatrix} M & 0 \\ 0 & K \end{bmatrix}$.

In this section we apply the algorithms WGKLU and WGKLL to solve the eigenvalue problem of $H$. We only consider WGKLU, since parallel results for WGKLL can be established in the same way. Let $X_j$, $Y_j$, $B_j$ be generated by Algorithm 1 after $j$ iterations. Define
$$\mathcal{X}_j = \begin{bmatrix} Y_j & 0 \\ 0 & X_j \end{bmatrix}, \qquad \mathcal{B}_j = \begin{bmatrix} 0 & B_j^T \\ B_j & 0 \end{bmatrix}.$$
Then from (8),
$$H\mathcal{X}_k = \mathcal{X}_k\mathcal{B}_k + \beta_k\begin{bmatrix} y_{k+1} \\ 0 \end{bmatrix}e_{2k}^T. \qquad (32)$$
Let
$$P_k = [e_1\ e_{k+1}\ e_2\ e_{k+2}\ \cdots\ e_k\ e_{2k}].$$

Then
$$H(\mathcal{X}_kP_k) = (\mathcal{X}_kP_k)(P_k^T\mathcal{B}_kP_k) + \beta_k\begin{bmatrix} y_{k+1} \\ 0 \end{bmatrix}e_{2k}^T, \qquad (33)$$
where $P_k^T\mathcal{B}_kP_k$ is a symmetric tridiagonal matrix with zero diagonal entries and
$$\mathcal{X}_kP_k = \begin{bmatrix} y_1 & 0 & y_2 & 0 & \cdots & y_k & 0 \\ 0 & x_1 & 0 & x_2 & \cdots & 0 & x_k \end{bmatrix}.$$
Using (11), one has
$$\mathrm{range}\,\mathcal{X}_k = \mathrm{range}\,\mathcal{X}_kP_k = \mathcal{K}_{2k}\left(H, \begin{bmatrix} y_1 \\ 0 \end{bmatrix}\right). \qquad (34)$$
So, running $k$ iterations of WGKLU is the same as running $2k$ iterations of a weighted Lanczos algorithm applied to $H$ with an initial vector of the special form $[y_1^T\ \ 0]^T$.

Suppose $B_k$ has an SVD (13) with $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k > 0$. From (31) and (32), we may take $\pm\sigma_1,\ldots,\pm\sigma_k$ as Ritz values of $H$,
$$v_j^\pm = \frac{1}{\sqrt2}\begin{bmatrix} Y_k\nu_j \\ \pm X_k\mu_j \end{bmatrix}, \qquad j = 1,\ldots,k,$$
as corresponding $\mathcal{K}$-orthonormal right Ritz vectors, and
$$u_j^\pm = \frac{1}{\sqrt2}\begin{bmatrix} \pm X_k\mu_j \\ Y_k\nu_j \end{bmatrix}, \qquad j = 1,\ldots,k,$$
as corresponding $\mathcal{M}$-orthonormal left Ritz vectors. From (32), for any $j\in\{1,\ldots,k\}$,
$$Hv_j^\pm = \pm\sigma_jv_j^\pm \pm \frac{\beta_k\mu_{kj}}{\sqrt2}\begin{bmatrix} y_{k+1} \\ 0 \end{bmatrix}, \qquad H^Tu_j^\pm = \pm\sigma_ju_j^\pm \pm \frac{\beta_k\mu_{kj}}{\sqrt2}\begin{bmatrix} 0 \\ y_{k+1} \end{bmatrix},$$
where $\mu_{kj} = e_k^T\mu_j$ is the $k$th component of $\mu_j$. In practice, we may use the residual norm
$$\|Hv_j^+ - \sigma_jv_j^+\|_{\mathcal{K}} = \|H^Tu_j^+ - \sigma_ju_j^+\|_{\mathcal{M}} = \frac{1}{\sqrt2}\|MX_k\mu_j - \sigma_jY_k\nu_j\|_K = \frac{\beta_k|\mu_{kj}|}{\sqrt2} \qquad (35)$$
to design a stopping criterion.

Remark 3. For (33), if $(\theta_i, g_i)$, $i = 1:2k$, are eigenpairs of $P_k^T\mathcal{B}_kP_k$, i.e., $P_k^T\mathcal{B}_kP_kg_i = \theta_ig_i$ with the $g_i$ orthonormal, then $(\theta_i, q_i)$, $i = 1:2k$, are approximate eigenpairs of $H$, where $q_i = \mathcal{X}_kP_kg_i$, and
$$\|Hq_i - \theta_iq_i\|_{\mathcal{K}} = \beta_k\left\|\begin{bmatrix} y_{k+1} \\ 0 \end{bmatrix}e_{2k}^Tg_i\right\|_{\mathcal{K}} = \beta_k|e_{2k}^Tg_i|. \qquad (36)$$
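The residual formula (35) can be checked directly. The lines below continue the earlier illustrative sketches (the names X, Y, B, alpha, beta, k, n, K, M come from there) and form one Ritz pair of $H$ from the SVD of $B_k$.

```python
import numpy as np

# Continuing the earlier sketches: one Ritz pair of H = [[0, M], [K, 0]] from the SVD of B_k.
Phi, sig, PsiT = np.linalg.svd(B)                        # B_k = Phi_k Sigma_k Psi_k^T, eq. (13)
j = 0                                                    # take the largest Ritz value sigma_1
v_plus = np.concatenate([Y[:, :k] @ PsiT[j], X @ Phi[:, j]]) / np.sqrt(2)   # v_j^+
H = np.block([[np.zeros((n, n)), M], [K, np.zeros((n, n))]])
Kw = np.block([[K, np.zeros((n, n))], [np.zeros((n, n)), M]])               # weight matrix diag(K, M)
res = H @ v_plus - sig[j] * v_plus
print(np.sqrt(res @ (Kw @ res)))                         # weighted residual norm of the Ritz pair
print(beta[-1] * abs(Phi[-1, j]) / np.sqrt(2))           # beta_k |mu_{kj}| / sqrt(2), cf. (35)
```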

16 The convergence results can be established by using Theorems 3 Theorem 5 Let γ j, γ j, π j, π j, π j,, π j,, δ j, and δ j be defined as in Theorem 3 Then for j = 1,, 0 λ j σ j = ( σ j ) ( λ j ) λ2 1 ( ) λ2 n πj, tan θ K (y 1, ξ j ) 2, λ j + σ j C j (1 + 2γ j ) 0 σ j λ n +j = ( λ n +j ) ( σ j ) λ2 1 ( ) λ2 n πj, tan θ K (y 1, ξ n +j ) 2 ; λ n +j + σ j C j 1 (1 + 2 γ j ) and for the Ritz vectors one has the bounds, ) ( ) sin θ K (v j ±,, x± j = sin θ M u ± j, y± j 1 π2 j (1 + (α β ) 2 /δj 2) cos ϱ j C j 2 (1 + 2γ sin 2 θ K (y 1, ξ j ) sin 2 ϱ j j) ) ( ) sin θ K (v j ±, x± n +j = sin θ M u ± j, y± n +j π sin2 ϱ j + cos 2 j 2 ϱ (1 + (α β ) 2 / δ j 2) j Cj 1 2 (1 + 2 γ sin 2 θ K (y 1, ξ n +j ), j) where ϱ j = arccos 2σ j λ j + σ j, ϱ j = arccos σ j + λ n +j 2σ j Proof The first two bounds are simply from (15) and (16) For the last two relations, it is trivial to have the equalities So we only need to prove the upper bounds Following (19) and the fact that ξj T KY ν j and ηj T MX µ j have the same sign, which is shown right above (19), ) cos θ K (v j ±, x± j = 1 2 ξt j KY ν j + ηj T MX µ j = 1 2 (cos θ K(Y ν j, ξ j ) + cos θ M (X µ j, η j )) = λ j + σ j 2σ j cos θ K (Y ν j, ξ j ) Since 0 2σ j /(λ j + σ j ) 1, cos ϱ j = one has 2σ j λ j +σ j is well defined Then from cos θ K (Y ν j, ξ j ) = cos ϱ j cos θ K (v ± j, x± j sin 2 θ K (Y ν j, ξ j ) = 1 cos 2 ϱ j cos 2 θ K (v ± j, x± j ), ) ( ) = sin 2 ϱ j + cos 2 ϱ j sin 2 θ K v j ±, x± j Hence ) sin θ K (v j ±, x± j = sin 2 θ K (Y ν j, ξ j ) sin 2 ϱ j / cos ϱ j The third bound then follows from (17) The last bound can be proved in the same way, by using the relation (20) 16

17 Remar 4 It is straightforward to show that the relation (33) is equivalent to 0 (R T L) T R T (Z L 0 P ) = (Z P )( P T B P v+1 ) + β e T 0 2, Z V 0 = 0 U with V = L T Y, U = R T X, Z T Z = I 2, and v +1 = L T y +1 This is a relation resulting in the standard symmetric Lanczos algorithm with the initial vector v T 1 0 T, v1 = L T y 1 So we can establish the following convergence results: For j = 1,,, where ( ˆπj, tan θ K (y 1, ξ j ) 0 λ j σ j = ( σ j ) ( λ j ) 2λ 1 C 2 j (1 + 2ˆγ j ) ) sin θ K (v j ±, x± j ˆπ j 1 + (α β ) 2 /ˆδ j 2 sin θ K (y 1, ξ j ), C 2 j (1 + ˆγ j ) ) 2 ˆγ j = λ j 1 j λ j+1 j 1 σ i + λ 1 λ i + λ 1, ˆπ 1, = ˆπ 1 = 1, ˆπ j, =, ˆπ j =, λ j+1 + λ 1 σ i=1 i λ j λ i=1 i λ j ˆδj = min i j λ j σ i However, it seems that it is not trivial to derive a bound for σ j λ n +j, since the small positive eigenvalues of H are the interior eigenvalues In 24, another type of Lanczos algorithms was proposed for solving the eigenvalue problem of H The algorithm is based on the factorizations KU = V T, MV = UD, U T V = I n with the assumption that M > 0, where T is symmetric tridiagonal and D is diagonal, so that U 0 U 0 0 D H = 0 V 0 V T 0 The first Lanczos-type algorithm in 24 computes the columns of U, V and the entries of D and T by enforcing the columns of V to be unit vectors By running iterations with the first column of V as an initial vector, the leading principal submatrices D and 0 D T of D and T, respectively, are computed Then the eigenvalues of are used to T 0 approximate the eigenvalues of H This algorithm wors even when K is indefinite On the other hand, when K > 0, Algorithms 1 and 2 exploit the symmetry of the problem and seem more natural 5 Numerical examples In this section, we test Algorithm 1 (WGKLU) with several numerical examples for solving the eigenvalue problem of H, where the initial vector y 1 is randomly selected satisfying y 1 K = 1 The numerical results are labeled with WGKLU For comparison we tested the first algorithm presented in 24 with the initial vector y 1 / y 1 The numerical results are labeled with Alg-TL We also ran the weighted Lanczos algorithm based on the relation (33) with H 17

being treated as a full matrix and $\mathcal{X}$ being a $\mathcal{K}$-orthonormal matrix. The initial vector is $[y_1^T\ \ 0]^T$. The numerical results are labeled with Alg-Full. For each example we ran $k$ iterations with Alg-Full, for a given even number $k$, and $k/2$ iterations with WGKLU and Alg-TL.

For each numerical example the matrices $M$ and $K$ were scaled with a positive diagonal matrix
$$S = \mathrm{diag}\left(\sqrt[4]{k_{11}/m_{11}},\ldots,\sqrt[4]{k_{nn}/m_{nn}}\right)$$
so that all the corresponding diagonal entries of $S^{-1}KS^{-1}$ and $SMS$ are identical. The Lanczos-type algorithms were then applied to the scaled symmetric positive definite matrix pair, or equivalently, to the diagonally scaled matrix
$$\hat H := \begin{bmatrix} S & 0 \\ 0 & S^{-1} \end{bmatrix}H\begin{bmatrix} S^{-1} & 0 \\ 0 & S \end{bmatrix},$$
which is similar to $H$. For each example we computed the two largest and the two smallest positive eigenvalues of $H$ (and $\hat H$). We report the relative errors $e(\sigma) := |\lambda - \sigma|/\lambda$ for the two largest and the two smallest positive eigenvalues $\lambda$ and their Ritz approximations $\sigma$, and the residual norms $r(\sigma) := \|Hv^+ - \sigma v^+\|_{\mathcal{K}}$ of the corresponding Ritz pairs. For WGKLU and Alg-Full, we use (35) and (36), respectively, as the formula for the residual norm; for Alg-TL, we compute the residual norms directly. The exact eigenvalues $\lambda_j$ are computed with the MATLAB function eig. All the numerical tests were run in MATLAB 7.0 on a computer with a 2.3 GHz Intel Core i5 and 4 GB of memory.

Example 1. In this example, we tested the three algorithms with an artificially constructed linear response eigenvalue problem given in [24]. The matrices $K$ and $M$ are extracted from the University of Florida sparse matrix collection [6]: $K$ is fv1 with $n = 9604$, and $M$ is the $n\times n$ leading principal submatrix of finan512. Both $K$ and $M$ are symmetric positive definite. We set $k = 200$ for the smallest positive eigenvalues and $k = 50$ for the largest eigenvalues. The testing results associated with the two smallest positive eigenvalues are shown in Figure 1, and the results associated with the two largest eigenvalues are shown in Figure 2. For the two smallest positive eigenvalues, WGKLU runs about 125 seconds, Alg-Full about 132 seconds, and Alg-TL about 471 seconds. For the two largest eigenvalues the times are about 0.28, 0.30, and 0.46 seconds, respectively. Alg-TL needs to compute the eigenvalues of $\begin{bmatrix} 0 & D_k \\ T_k & 0 \end{bmatrix}$, which is treated as a general nonsymmetric matrix; this is the part that slows Alg-TL down. On the other hand, the numerical results of Alg-Full are less accurate than those of the other two algorithms.

Example 2. In this example, we tested the problem with the positive definite matrices $K$ and $M$ given in [24] for the sodium dimer Na2. The order is $n = 1862$. We set $k = 150$ for the smallest positive eigenvalues and $k = 20$ for the largest eigenvalues. The numerical results are reported in Figures 3 and 4, associated with the two smallest positive and the two largest eigenvalues, respectively.
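As a small illustration of the diagonal scaling described above (using the illustrative K and M from the earlier sketches, not the test matrices of the examples):

```python
import numpy as np

# Sketch of the diagonal scaling S = diag((k_ii/m_ii)^{1/4}), which makes
# the diagonals of S^{-1} K S^{-1} and S M S identical (illustrative K, M from above).
s = (np.diag(K) / np.diag(M)) ** 0.25
Ks = K / np.outer(s, s)          # S^{-1} K S^{-1}
Ms = M * np.outer(s, s)          # S M S
print(np.allclose(np.diag(Ks), np.diag(Ms)))
```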

Figure 1: Error and convergence with the smallest two positive eigenvalues in Example 1.

Figure 2: Error and convergence with the largest two eigenvalues in Example 1.

Figure 3: Error and convergence with the smallest two positive eigenvalues in Example 2.

For the two smallest positive eigenvalues, WGKLU runs about 55 seconds, Alg-Full about 158 seconds, and Alg-TL about 243 seconds.

Figure 4: Error and convergence with the largest two eigenvalues in Example 2.

For the two largest eigenvalues the times are about 0.57, 1.65, and 1.59 seconds, respectively. As in Example 1, WGKLU and Alg-TL provide numerical results with comparable accuracy, whereas Alg-Full provides less accurate results.

6 Connection with conjugate gradient methods

Consider the system of linear equations
$$Mz = b, \qquad M > 0. \qquad (37)$$
Let $z_0$ be an initial guess of the solution $z_e = M^{-1}b$ and $r_0 = b - Mz_0 = M(z_e - z_0)$ be the corresponding residual. Assume that $X_k$, $Y_k$ and $B_k$ are computed by WGKLU with $M$, another positive definite matrix $K$, and $y_1 = r_0/\|r_0\|_K$. Then they satisfy (8) and (11). We approximate the solution $z_e$ with a vector $z_k \in z_0 + K\mathcal{K}_k(MK, y_1)$ for $1 \le k \le n$. From (11), we may express $z_k = z_0 + X_kw_k$ for some $w_k\in\mathbb{R}^k$. We take the approximation $z_k$ (or equivalently $w_k$) as the minimizer of the minimization problem
$$\min_{w_k} J(w_k), \qquad J(w_k) = \varepsilon_k^TM\varepsilon_k, \quad \varepsilon_k = z_e - z_k = \varepsilon_0 - X_kw_k.$$
Since
$$J(w_k) = w_k^TX_k^TMX_kw_k - 2w_k^TX_k^TM\varepsilon_0 + \varepsilon_0^TM\varepsilon_0 = w_k^TX_k^TMX_kw_k - 2w_k^TX_k^Tr_0 + \varepsilon_0^TM\varepsilon_0,$$
$J(w_k)$ is minimized when $w_k$ satisfies $X_k^TMX_kw_k = X_k^Tr_0$. Note that $X_k^TMX_k = I$. Using $r_0 = \|r_0\|_Ky_1$, $Y_k^TKY_k = I$, and the first relation of (8), one has
$$X_k^Tr_0 = \|r_0\|_K(KY_kB_k^{-1})^Ty_1 = \|r_0\|_KB_k^{-T}Y_k^TKy_1 = \|r_0\|_KB_k^{-T}e_1.$$

Hence the minimizer is $z_k = z_0 + X_kw_k$ with $w_k = \|r_0\|_KB_k^{-T}e_1$.

The vector $w_k$ can be computed in an iterative way following the iterations of WGKLU. Note that
$$B_k^T = \begin{bmatrix} \alpha_1 & & & \\ \beta_1 & \alpha_2 & & \\ & \ddots & \ddots & \\ & & \beta_{k-1} & \alpha_k \end{bmatrix} = \begin{bmatrix} B_{k-1}^T & 0 \\ \beta_{k-1}e_{k-1}^T & \alpha_k \end{bmatrix}.$$
So
$$B_k^{-T} = \begin{bmatrix} B_{k-1}^{-T} & 0 \\ -\frac{\beta_{k-1}}{\alpha_k}e_{k-1}^TB_{k-1}^{-T} & \frac{1}{\alpha_k} \end{bmatrix},$$
and denoting $w_k = [\varphi_1\ \cdots\ \varphi_k]^T$, one has
$$w_k = \|r_0\|_KB_k^{-T}e_1 = \begin{bmatrix} \|r_0\|_KB_{k-1}^{-T}e_1 \\ -\frac{\beta_{k-1}}{\alpha_k}\|r_0\|_Ke_{k-1}^TB_{k-1}^{-T}e_1 \end{bmatrix} = \begin{bmatrix} w_{k-1} \\ \varphi_k \end{bmatrix},$$
where $\varphi_k$ follows the recursion
$$\varphi_k = -\frac{\beta_{k-1}}{\alpha_k}\varphi_{k-1}, \quad k \ge 1; \qquad \beta_0 = 1, \quad \varphi_0 = -\|r_0\|_K. \qquad (38)$$
Therefore,
$$z_k = z_0 + X_kw_k = z_{k-1} + \varphi_kx_k, \quad k \ge 1,$$
and using $B_k^Tw_k = \|r_0\|_Ke_1$ and the second relation in (8), the corresponding residual is
$$r_k = b - Mz_k = r_0 - MX_kw_k = r_0 - (Y_kB_k^T + \beta_ky_{k+1}e_k^T)w_k = r_0 - \|r_0\|_KY_ke_1 - \beta_k\varphi_ky_{k+1} = -\beta_k\varphi_ky_{k+1}, \quad k \ge 0.$$
So we have the following algorithm for solving (37).

Algorithm 3 (WGKLULin)
Choose $z_0$ and compute $r_0 = b - Mz_0$, $\varphi_0 = -\|r_0\|_K$, and $y_1 = r_0/\|r_0\|_K$. Set $\beta_0 = 1$, $x_0 = 0$. Compute $g_1 = Ky_1$.
For $j = 1, 2, \ldots$
  $s_j = g_j/\beta_{j-1} - \beta_{j-1}x_{j-1}$
  $f_j = Ms_j$
  $\alpha_j = (s_j^Tf_j)^{1/2}$
  $x_j = s_j/\alpha_j$
  $\varphi_j = -\beta_{j-1}\varphi_{j-1}/\alpha_j$
  $z_j = z_{j-1} + \varphi_jx_j$
  $t_{j+1} = f_j/\alpha_j - \alpha_jy_j$
  $g_{j+1} = Kt_{j+1}$
  $\beta_j = (t_{j+1}^Tg_{j+1})^{1/2}$
  $y_{j+1} = t_{j+1}/\beta_j$
  $r_j = -\varphi_jt_{j+1}$
End
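A compact NumPy sketch of Algorithm 3, following the sign convention of (38), may make the update of $z_j$ and $r_j$ concrete; the function name and test data are illustrative.

```python
import numpy as np

def wgklu_lin(K, M, b, z0, k):
    """Sketch of Algorithm 3 (WGKLULin): k steps for M z = b with SPD M, weighted by SPD K."""
    n = b.size
    z = z0.copy()
    r0 = b - M @ z
    phi = -np.sqrt(r0 @ (K @ r0))            # phi_0 = -||r_0||_K, cf. (38)
    y = r0 / (-phi)                          # y_1 = r_0 / ||r_0||_K
    g = K @ y
    beta_prev, x = 1.0, np.zeros(n)
    for _ in range(k):
        s = g / beta_prev - beta_prev * x
        f = M @ s
        a = np.sqrt(s @ f)                   # alpha_j
        x = s / a
        phi = -beta_prev * phi / a           # phi_j
        z = z + phi * x                      # z_j = z_{j-1} + phi_j x_j
        t = f / a - a * y
        g = K @ t
        beta_prev = np.sqrt(t @ g)           # beta_j
        y = t / beta_prev
    return z, -phi * t                       # z_k and the residual r_k = -phi_k t_{k+1}

# Illustrative use: the returned residual should agree with b - M z_k.
rng = np.random.default_rng(2)
n = 80
A = rng.standard_normal((n, n)); M = A @ A.T + n * np.eye(n)
C = rng.standard_normal((n, n)); K = C @ C.T + n * np.eye(n)
b = rng.standard_normal(n)
z, r = wgklu_lin(K, M, b, np.zeros(n), 40)
print(np.linalg.norm(b - M @ z), np.linalg.norm(r - (b - M @ z)))
```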

We now show that this algorithm is equivalent to a weighted conjugate gradient (CG) method. Introduce the vectors
$$p_{k-1} = \alpha_k^2\varphi_kx_k, \quad k \ge 1, \qquad\text{with}\qquad p_0 = \alpha_1^2\varphi_1x_1 = \alpha_1\varphi_1Ky_1 = \|r_0\|_KKy_1 = Kr_0.$$
Then
$$p_{k-1}^TMp_{k-1} = \alpha_k^4\varphi_k^2x_k^TMx_k = \alpha_k^4\varphi_k^2.$$
Since $r_k = -\beta_k\varphi_ky_{k+1}$, using (38) one has
$$r_k^TKr_k = \beta_k^2\varphi_k^2y_{k+1}^TKy_{k+1} = \beta_k^2\varphi_k^2 = \alpha_{k+1}^2\varphi_{k+1}^2.$$
We then have, using (38) and $\alpha_k^2\varphi_k^2 = \beta_{k-1}^2\varphi_{k-1}^2 = r_{k-1}^TKr_{k-1}$,
$$\alpha_k^{-2} = \frac{r_{k-1}^TKr_{k-1}}{p_{k-1}^TMp_{k-1}},$$
and
$$z_k = z_{k-1} + \varphi_kx_k = z_{k-1} + \alpha_k^{-2}p_{k-1} = z_{k-1} + \gamma_{k-1}p_{k-1}, \qquad \gamma_{k-1} = \alpha_k^{-2} = \frac{r_{k-1}^TKr_{k-1}}{p_{k-1}^TMp_{k-1}},$$
with the corresponding residual
$$r_k = b - Mz_k = r_{k-1} - \gamma_{k-1}Mp_{k-1}.$$
Since $Ky_{k+1} = \beta_kx_k + \alpha_{k+1}x_{k+1}$, multiplying this equation by $\alpha_{k+1}\varphi_{k+1}$ and using (38) one has
$$p_k = \alpha_{k+1}^2\varphi_{k+1}x_{k+1} = -\alpha_{k+1}\beta_k\varphi_{k+1}x_k + \alpha_{k+1}\varphi_{k+1}Ky_{k+1} = \beta_k^2\varphi_kx_k - \beta_k\varphi_kKy_{k+1} = \frac{\beta_k^2}{\alpha_k^2}p_{k-1} + Kr_k.$$
Since
$$\vartheta_{k-1} := \frac{\beta_k^2}{\alpha_k^2} = \frac{\beta_k^2\varphi_k^2}{\alpha_k^2\varphi_k^2} = \frac{r_k^TKr_k}{r_{k-1}^TKr_{k-1}},$$
we have
$$p_k = Kr_k + \vartheta_{k-1}p_{k-1}, \qquad \vartheta_{k-1} = \frac{r_k^TKr_k}{r_{k-1}^TKr_{k-1}}.$$
So we arrive at the following simplified version.

Algorithm 4 (WCG)
Choose $z_0$ and compute $r_0 = b - Mz_0$ and $p_0 = Kr_0$.
For $j = 0, 1, 2, \ldots$
  $\gamma_j = r_j^TKr_j/p_j^TMp_j$
  $z_{j+1} = z_j + \gamma_jp_j$
  $r_{j+1} = r_j - \gamma_jMp_j$
  $\vartheta_j = r_{j+1}^TKr_{j+1}/r_j^TKr_j$
  $p_{j+1} = Kr_{j+1} + \vartheta_jp_j$
End
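For completeness, here is a direct NumPy transcription of Algorithm 4 (our own sketch; with $K = I$ it reduces to the standard CG method). The usage line reuses the illustrative test data from the previous sketch.

```python
import numpy as np

def wcg(K, M, b, z0, tol=1e-12, maxit=1000):
    """Sketch of Algorithm 4 (WCG) for M z = b with SPD M, weighted by SPD K."""
    z = z0.copy()
    r = b - M @ z
    p = K @ r
    rKr = r @ p                              # r^T K r
    for _ in range(maxit):
        Mp = M @ p
        gamma = rKr / (p @ Mp)
        z = z + gamma * p
        r = r - gamma * Mp
        Kr = K @ r
        rKr_new = r @ Kr
        if np.sqrt(rKr_new) <= tol:
            break
        p = Kr + (rKr_new / rKr) * p         # theta_j = r_{j+1}^T K r_{j+1} / r_j^T K r_j
        rKr = rKr_new
    return z

# Illustrative use with the test data defined in the previous sketch (assumed available).
z_wcg = wcg(K, M, b, np.zeros(n))
print(np.linalg.norm(b - M @ z_wcg))
```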

We call Algorithm 4 the weighted CG algorithm, since for $K = LL^T$, if we set $\tilde r_j = L^Tr_j$, $\tilde p_j = L^{-1}p_j$, $\tilde z_j = L^{-1}z_j$, then it is not difficult to see that the above algorithm is nothing but the standard CG algorithm applied to the symmetric linear system
$$\tilde A\tilde z = \tilde b, \qquad \tilde A = L^TML, \quad \tilde z = L^{-1}z, \quad \tilde b = L^Tb.$$
In particular, when $K = I$, Algorithm 4 is just the standard CG algorithm. Therefore, the algorithm WGKLU provides a simpler connection to the standard CG algorithm than the usual connection through the symmetric Lanczos algorithm. In this case the residual vectors $r_0,\ldots,r_{k-1}$ are orthogonal and parallel to the columns of $Y_k$, and the search direction vectors $p_0,\ldots,p_{k-1}$ are $M$-orthogonal and parallel to the columns of $X_k$; WGKLU just provides an iterative way to compute both $X_k$ and $Y_k$. Similarly, the algorithm WGKLL can be employed to solve (37) as well.

7 Conclusions

We proposed two weighted Golub-Kahan-Lanczos bidiagonalization algorithms associated with two symmetric positive definite matrices $K$ and $M$. The algorithms can be implemented naturally to solve the large scale eigenvalue problems of $MK$ and of the matrix $H = \begin{bmatrix} 0 & M \\ K & 0 \end{bmatrix}$. For these eigenvalue solvers, convergence results have been established. We also showed a simple connection between the algorithm WGKLU and the standard CG method. The proposed algorithms are still not ready for practical use. For instance, it is not clear how weighted orthogonality will affect the convergence behavior, in particular in the presence of rounding errors. In order to make the algorithms more efficient and capable of dealing with different kinds of related problems, other techniques have to be incorporated. All these require further investigation.

Acknowledgment. The second author was supported in part by the China Scholarship Council (CSC) during a visit at the Department of Mathematics, University of Kansas, and by the National Science Foundation of China.

References

[1] M. Arioli, Generalized Golub-Kahan bidiagonalization and stopping criteria, SIAM J. Matrix Anal. Appl., 34.
[2] Z.-J. Bai and R.-C. Li, Minimization principles for the linear response eigenvalue problem I: Theory, SIAM J. Matrix Anal. Appl., 33.
[3] Z.-J. Bai and R.-C. Li, Minimization principles for the linear response eigenvalue problem II: Computation, SIAM J. Matrix Anal. Appl., 34.

[4] S. J. Benbow, Solving generalized least-squares problems with LSQR, SIAM J. Matrix Anal. Appl., 21.
[5] M. E. Casida, Time-dependent density-functional response theory for molecules, in Recent Advances in Density Functional Methods, D. P. Chong, ed., World Scientific, Singapore.
[6] T. Davis and Y. Hu, The University of Florida sparse matrix collection, ACM Trans. Math. Software, 38:1-25.
[7] A. Essai, Weighted FOM and GMRES for solving nonsymmetric linear systems, Numer. Algorithms, 18.
[8] G. H. Golub and W. Kahan, Calculating the singular values and pseudo-inverse of a matrix, J. SIAM Ser. B Numer. Anal., 2:205-224.
[9] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd Edition, Johns Hopkins University Press, Baltimore, MD.
[10] M. Grüning, A. Marini, and X. Gonze, Exciton-plasmon states in nanoscale materials: Breakdown of the Tamm-Dancoff approximation, Nano Letters, 9.
[11] M. J. Lucero, A. M. N. Niklasson, S. Tretiak, and M. Challacombe, Molecular-orbital-free algorithm for excited states in time-dependent perturbation theory, J. Chem. Phys., 129:064114.
[12] M. T. Lusk and A. E. Mattsson, High-performance computing for materials design to advance energy science, MRS Bulletin, 36.
[13] C. Mehl, V. Mehrmann, and H. Xu, On doubly structured matrices and pencils that arise in linear response theory, Linear Algebra Appl., 380:3-51.
[14] G. Onida, L. Reining, and A. Rubio, Electronic excitations: Density-functional versus many-body Green's function approaches, Rev. Modern Phys., 74.
[15] C. C. Paige, Bidiagonalization of matrices and solution of linear equations, SIAM J. Numer. Anal., 11.
[16] C. C. Paige and M. Saunders, LSQR: An algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Software, 8:43-71.
[17] P. Papakonstantinou, Reduction of the RPA eigenvalue problem and a generalized Cholesky decomposition for real-symmetric matrices, EPL (Europhys. Lett.), 78:12001.
[18] B. N. Parlett, The Symmetric Eigenvalue Problem, SIAM, Philadelphia.
[19] D. Rocca, Time-dependent density functional perturbation theory: new algorithms with applications to molecular spectra, PhD Thesis, The International School for Advanced Studies, Trieste, Italy.

[20] D. Rocca, D. Lu, and G. Galli, Ab initio calculations of optical absorption spectra: Solution of the Bethe-Salpeter equation within density matrix perturbation theory, J. Chem. Phys., 133:164109.
[21] Y. Saad, Iterative Methods for Sparse Linear Systems, 2nd Edition, SIAM, Philadelphia.
[22] Y. Saad, Numerical Methods for Large Eigenvalue Problems, 2nd Edition, SIAM, Philadelphia.
[23] Y. Saad, J. R. Chelikowsky, and S. M. Shontz, Numerical methods for electronic structure calculations of materials, SIAM Rev., 52:3-54.
[24] Z.-M. Teng and R.-C. Li, Convergence analysis of Lanczos-type methods for the linear response eigenvalue problem, J. Comput. Appl. Math., 247:17-33.
[25] E. V. Tsiper, A classical mechanics technique for quantum linear response, J. Phys. B: At. Mol. Opt. Phys., 34:L401-L407.


More information

LARGE SPARSE EIGENVALUE PROBLEMS. General Tools for Solving Large Eigen-Problems

LARGE SPARSE EIGENVALUE PROBLEMS. General Tools for Solving Large Eigen-Problems LARGE SPARSE EIGENVALUE PROBLEMS Projection methods The subspace iteration Krylov subspace methods: Arnoldi and Lanczos Golub-Kahan-Lanczos bidiagonalization General Tools for Solving Large Eigen-Problems

More information

Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method

Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method Leslie Foster 11-5-2012 We will discuss the FOM (full orthogonalization method), CG,

More information

arxiv: v1 [math.na] 1 Sep 2018

arxiv: v1 [math.na] 1 Sep 2018 On the perturbation of an L -orthogonal projection Xuefeng Xu arxiv:18090000v1 [mathna] 1 Sep 018 September 5 018 Abstract The L -orthogonal projection is an important mathematical tool in scientific computing

More information

ETNA Kent State University

ETNA Kent State University Electronic Transactions on Numerical Analysis. Volume 1, pp. 1-11, 8. Copyright 8,. ISSN 168-961. MAJORIZATION BOUNDS FOR RITZ VALUES OF HERMITIAN MATRICES CHRISTOPHER C. PAIGE AND IVO PANAYOTOV Abstract.

More information

Charles University Faculty of Mathematics and Physics DOCTORAL THESIS. Krylov subspace approximations in linear algebraic problems

Charles University Faculty of Mathematics and Physics DOCTORAL THESIS. Krylov subspace approximations in linear algebraic problems Charles University Faculty of Mathematics and Physics DOCTORAL THESIS Iveta Hnětynková Krylov subspace approximations in linear algebraic problems Department of Numerical Mathematics Supervisor: Doc. RNDr.

More information

LARGE SPARSE EIGENVALUE PROBLEMS

LARGE SPARSE EIGENVALUE PROBLEMS LARGE SPARSE EIGENVALUE PROBLEMS Projection methods The subspace iteration Krylov subspace methods: Arnoldi and Lanczos Golub-Kahan-Lanczos bidiagonalization 14-1 General Tools for Solving Large Eigen-Problems

More information

An Iterative Method for the Least-Squares Minimum-Norm Symmetric Solution

An Iterative Method for the Least-Squares Minimum-Norm Symmetric Solution Copyright 2011 Tech Science Press CMES, vol.77, no.3, pp.173-182, 2011 An Iterative Method for the Least-Squares Minimum-Norm Symmetric Solution Minghui Wang 1, Musheng Wei 2 and Shanrui Hu 1 Abstract:

More information

Research Article Some Generalizations and Modifications of Iterative Methods for Solving Large Sparse Symmetric Indefinite Linear Systems

Research Article Some Generalizations and Modifications of Iterative Methods for Solving Large Sparse Symmetric Indefinite Linear Systems Abstract and Applied Analysis Article ID 237808 pages http://dxdoiorg/055/204/237808 Research Article Some Generalizations and Modifications of Iterative Methods for Solving Large Sparse Symmetric Indefinite

More information

A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY

A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY RONALD B. MORGAN AND MIN ZENG Abstract. A restarted Arnoldi algorithm is given that computes eigenvalues

More information

DELFT UNIVERSITY OF TECHNOLOGY

DELFT UNIVERSITY OF TECHNOLOGY DELFT UNIVERSITY OF TECHNOLOGY REPORT 16-02 The Induced Dimension Reduction method applied to convection-diffusion-reaction problems R. Astudillo and M. B. van Gijzen ISSN 1389-6520 Reports of the Delft

More information

Introduction. Chapter One

Introduction. Chapter One Chapter One Introduction The aim of this book is to describe and explain the beautiful mathematical relationships between matrices, moments, orthogonal polynomials, quadrature rules and the Lanczos and

More information

SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA

SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA 1 SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA 2 OUTLINE Sparse matrix storage format Basic factorization

More information

Krylov subspace projection methods

Krylov subspace projection methods I.1.(a) Krylov subspace projection methods Orthogonal projection technique : framework Let A be an n n complex matrix and K be an m-dimensional subspace of C n. An orthogonal projection technique seeks

More information

Orthogonalization and least squares methods

Orthogonalization and least squares methods Chapter 3 Orthogonalization and least squares methods 31 QR-factorization (QR-decomposition) 311 Householder transformation Definition 311 A complex m n-matrix R = [r ij is called an upper (lower) triangular

More information

15 Singular Value Decomposition

15 Singular Value Decomposition 15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

Iterative methods for Linear System of Equations. Joint Advanced Student School (JASS-2009)

Iterative methods for Linear System of Equations. Joint Advanced Student School (JASS-2009) Iterative methods for Linear System of Equations Joint Advanced Student School (JASS-2009) Course #2: Numerical Simulation - from Models to Software Introduction In numerical simulation, Partial Differential

More information

Iterative methods for Linear System

Iterative methods for Linear System Iterative methods for Linear System JASS 2009 Student: Rishi Patil Advisor: Prof. Thomas Huckle Outline Basics: Matrices and their properties Eigenvalues, Condition Number Iterative Methods Direct and

More information

A Divide-and-Conquer Method for the Takagi Factorization

A Divide-and-Conquer Method for the Takagi Factorization A Divide-and-Conquer Method for the Takagi Factorization Wei Xu 1 and Sanzheng Qiao 1, Department of Computing and Software, McMaster University Hamilton, Ont, L8S 4K1, Canada. 1 xuw5@mcmaster.ca qiao@mcmaster.ca

More information

A fast randomized algorithm for overdetermined linear least-squares regression

A fast randomized algorithm for overdetermined linear least-squares regression A fast randomized algorithm for overdetermined linear least-squares regression Vladimir Rokhlin and Mark Tygert Technical Report YALEU/DCS/TR-1403 April 28, 2008 Abstract We introduce a randomized algorithm

More information

Ritz Value Bounds That Exploit Quasi-Sparsity

Ritz Value Bounds That Exploit Quasi-Sparsity Banff p.1 Ritz Value Bounds That Exploit Quasi-Sparsity Ilse Ipsen North Carolina State University Banff p.2 Motivation Quantum Physics Small eigenvalues of large Hamiltonians Quasi-Sparse Eigenvector

More information

Arnoldi Methods in SLEPc

Arnoldi Methods in SLEPc Scalable Library for Eigenvalue Problem Computations SLEPc Technical Report STR-4 Available at http://slepc.upv.es Arnoldi Methods in SLEPc V. Hernández J. E. Román A. Tomás V. Vidal Last update: October,

More information

Math 408 Advanced Linear Algebra

Math 408 Advanced Linear Algebra Math 408 Advanced Linear Algebra Chi-Kwong Li Chapter 4 Hermitian and symmetric matrices Basic properties Theorem Let A M n. The following are equivalent. Remark (a) A is Hermitian, i.e., A = A. (b) x

More information

Computing least squares condition numbers on hybrid multicore/gpu systems

Computing least squares condition numbers on hybrid multicore/gpu systems Computing least squares condition numbers on hybrid multicore/gpu systems M. Baboulin and J. Dongarra and R. Lacroix Abstract This paper presents an efficient computation for least squares conditioning

More information

Golub-Kahan iterative bidiagonalization and determining the noise level in the data

Golub-Kahan iterative bidiagonalization and determining the noise level in the data Golub-Kahan iterative bidiagonalization and determining the noise level in the data Iveta Hnětynková,, Martin Plešinger,, Zdeněk Strakoš, * Charles University, Prague ** Academy of Sciences of the Czech

More information

Eigenvalue Problems. Eigenvalue problems occur in many areas of science and engineering, such as structural analysis

Eigenvalue Problems. Eigenvalue problems occur in many areas of science and engineering, such as structural analysis Eigenvalue Problems Eigenvalue problems occur in many areas of science and engineering, such as structural analysis Eigenvalues also important in analyzing numerical methods Theory and algorithms apply

More information

On the Local Convergence of an Iterative Approach for Inverse Singular Value Problems

On the Local Convergence of an Iterative Approach for Inverse Singular Value Problems On the Local Convergence of an Iterative Approach for Inverse Singular Value Problems Zheng-jian Bai Benedetta Morini Shu-fang Xu Abstract The purpose of this paper is to provide the convergence theory

More information

Preconditioned inverse iteration and shift-invert Arnoldi method

Preconditioned inverse iteration and shift-invert Arnoldi method Preconditioned inverse iteration and shift-invert Arnoldi method Melina Freitag Department of Mathematical Sciences University of Bath CSC Seminar Max-Planck-Institute for Dynamics of Complex Technical

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 4 Eigenvalue Problems Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002. Reproduction

More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

Throughout these notes we assume V, W are finite dimensional inner product spaces over C.

Throughout these notes we assume V, W are finite dimensional inner product spaces over C. Math 342 - Linear Algebra II Notes Throughout these notes we assume V, W are finite dimensional inner product spaces over C 1 Upper Triangular Representation Proposition: Let T L(V ) There exists an orthonormal

More information

Solving linear equations with Gaussian Elimination (I)

Solving linear equations with Gaussian Elimination (I) Term Projects Solving linear equations with Gaussian Elimination The QR Algorithm for Symmetric Eigenvalue Problem The QR Algorithm for The SVD Quasi-Newton Methods Solving linear equations with Gaussian

More information

Sparse matrix methods in quantum chemistry Post-doctorale cursus quantumchemie Han-sur-Lesse, Belgium

Sparse matrix methods in quantum chemistry Post-doctorale cursus quantumchemie Han-sur-Lesse, Belgium Sparse matrix methods in quantum chemistry Post-doctorale cursus quantumchemie Han-sur-Lesse, Belgium Dr. ir. Gerrit C. Groenenboom 14 june - 18 june, 1993 Last revised: May 11, 2005 Contents 1 Introduction

More information

Downloaded 11/08/17 to Redistribution subject to SIAM license or copyright; see

Downloaded 11/08/17 to Redistribution subject to SIAM license or copyright; see SIAM J. OPTIM. Vol. 27, No. 3, pp. 20 242 c 207 Society for Industrial and Applied Mathematics ON THE GENERALIZED LANCZOS TRUST-REGION METHOD LEI-HONG ZHANG, CHUNGEN SHEN, AND REN-CANG LI Abstract. The

More information

Iterative methods for symmetric eigenvalue problems

Iterative methods for symmetric eigenvalue problems s Iterative s for symmetric eigenvalue problems, PhD McMaster University School of Computational Engineering and Science February 11, 2008 s 1 The power and its variants Inverse power Rayleigh quotient

More information

Algebra C Numerical Linear Algebra Sample Exam Problems

Algebra C Numerical Linear Algebra Sample Exam Problems Algebra C Numerical Linear Algebra Sample Exam Problems Notation. Denote by V a finite-dimensional Hilbert space with inner product (, ) and corresponding norm. The abbreviation SPD is used for symmetric

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

6.4 Krylov Subspaces and Conjugate Gradients

6.4 Krylov Subspaces and Conjugate Gradients 6.4 Krylov Subspaces and Conjugate Gradients Our original equation is Ax = b. The preconditioned equation is P Ax = P b. When we write P, we never intend that an inverse will be explicitly computed. P

More information

Iterative Methods for Sparse Linear Systems

Iterative Methods for Sparse Linear Systems Iterative Methods for Sparse Linear Systems Luca Bergamaschi e-mail: berga@dmsa.unipd.it - http://www.dmsa.unipd.it/ berga Department of Mathematical Methods and Models for Scientific Applications University

More information

ECS130 Scientific Computing Handout E February 13, 2017

ECS130 Scientific Computing Handout E February 13, 2017 ECS130 Scientific Computing Handout E February 13, 2017 1. The Power Method (a) Pseudocode: Power Iteration Given an initial vector u 0, t i+1 = Au i u i+1 = t i+1 / t i+1 2 (approximate eigenvector) θ

More information

Math 108b: Notes on the Spectral Theorem

Math 108b: Notes on the Spectral Theorem Math 108b: Notes on the Spectral Theorem From section 6.3, we know that every linear operator T on a finite dimensional inner product space V has an adjoint. (T is defined as the unique linear operator

More information

College of William & Mary Department of Computer Science

College of William & Mary Department of Computer Science Technical Report WM-CS-2009-06 College of William & Mary Department of Computer Science WM-CS-2009-06 Extending the eigcg algorithm to non-symmetric Lanczos for linear systems with multiple right-hand

More information

Lecture 8 Fast Iterative Methods for linear systems of equations

Lecture 8 Fast Iterative Methods for linear systems of equations CGLS Select x 0 C n x = x 0, r = b Ax 0 u = 0, ρ = 1 while r 2 > tol do s = A r ρ = ρ, ρ = s s, β = ρ/ρ u s βu, c = Au σ = c c, α = ρ/σ r r αc x x+αu end while Graig s method Select x 0 C n x = x 0, r

More information

4.8 Arnoldi Iteration, Krylov Subspaces and GMRES

4.8 Arnoldi Iteration, Krylov Subspaces and GMRES 48 Arnoldi Iteration, Krylov Subspaces and GMRES We start with the problem of using a similarity transformation to convert an n n matrix A to upper Hessenberg form H, ie, A = QHQ, (30) with an appropriate

More information

THE SPECTRUM OF THE LAPLACIAN MATRIX OF A BALANCED 2 p -ARY TREE

THE SPECTRUM OF THE LAPLACIAN MATRIX OF A BALANCED 2 p -ARY TREE Proyecciones Vol 3, N o, pp 131-149, August 004 Universidad Católica del Norte Antofagasta - Chile THE SPECTRUM OF THE LAPLACIAN MATRIX OF A BALANCED p -ARY TREE OSCAR ROJO Universidad Católica del Norte,

More information

The Eigenvalue Problem: Perturbation Theory

The Eigenvalue Problem: Perturbation Theory Jim Lambers MAT 610 Summer Session 2009-10 Lecture 13 Notes These notes correspond to Sections 7.2 and 8.1 in the text. The Eigenvalue Problem: Perturbation Theory The Unsymmetric Eigenvalue Problem Just

More information

Sensitivity of Gauss-Christoffel quadrature and sensitivity of Jacobi matrices to small changes of spectral data

Sensitivity of Gauss-Christoffel quadrature and sensitivity of Jacobi matrices to small changes of spectral data Sensitivity of Gauss-Christoffel quadrature and sensitivity of Jacobi matrices to small changes of spectral data Zdeněk Strakoš Academy of Sciences and Charles University, Prague http://www.cs.cas.cz/

More information

Error estimates for the ESPRIT algorithm

Error estimates for the ESPRIT algorithm Error estimates for the ESPRIT algorithm Daniel Potts Manfred Tasche Let z j := e f j j = 1,..., M) with f j [ ϕ, 0] + i [ π, π) and small ϕ 0 be distinct nodes. With complex coefficients c j 0, we consider

More information

S.F. Xu (Department of Mathematics, Peking University, Beijing)

S.F. Xu (Department of Mathematics, Peking University, Beijing) Journal of Computational Mathematics, Vol.14, No.1, 1996, 23 31. A SMALLEST SINGULAR VALUE METHOD FOR SOLVING INVERSE EIGENVALUE PROBLEMS 1) S.F. Xu (Department of Mathematics, Peking University, Beijing)

More information

Interlacing Inequalities for Totally Nonnegative Matrices

Interlacing Inequalities for Totally Nonnegative Matrices Interlacing Inequalities for Totally Nonnegative Matrices Chi-Kwong Li and Roy Mathias October 26, 2004 Dedicated to Professor T. Ando on the occasion of his 70th birthday. Abstract Suppose λ 1 λ n 0 are

More information

Lecture 9 Least Square Problems

Lecture 9 Least Square Problems March 26, 2018 Lecture 9 Least Square Problems Consider the least square problem Ax b (β 1,,β n ) T, where A is an n m matrix The situation where b R(A) is of particular interest: often there is a vectors

More information

UNIFYING LEAST SQUARES, TOTAL LEAST SQUARES AND DATA LEAST SQUARES

UNIFYING LEAST SQUARES, TOTAL LEAST SQUARES AND DATA LEAST SQUARES UNIFYING LEAST SQUARES, TOTAL LEAST SQUARES AND DATA LEAST SQUARES Christopher C. Paige School of Computer Science, McGill University, Montreal, Quebec, Canada, H3A 2A7 paige@cs.mcgill.ca Zdeněk Strakoš

More information

p q m p q p q m p q p Σ q 0 I p Σ 0 0, n 2p q

p q m p q p q m p q p Σ q 0 I p Σ 0 0, n 2p q A NUMERICAL METHOD FOR COMPUTING AN SVD-LIKE DECOMPOSITION HONGGUO XU Abstract We present a numerical method to compute the SVD-like decomposition B = QDS 1 where Q is orthogonal S is symplectic and D

More information

Exponentials of Symmetric Matrices through Tridiagonal Reductions

Exponentials of Symmetric Matrices through Tridiagonal Reductions Exponentials of Symmetric Matrices through Tridiagonal Reductions Ya Yan Lu Department of Mathematics City University of Hong Kong Kowloon, Hong Kong Abstract A simple and efficient numerical algorithm

More information

On Solving Large Algebraic. Riccati Matrix Equations

On Solving Large Algebraic. Riccati Matrix Equations International Mathematical Forum, 5, 2010, no. 33, 1637-1644 On Solving Large Algebraic Riccati Matrix Equations Amer Kaabi Department of Basic Science Khoramshahr Marine Science and Technology University

More information

Numerical Methods I Eigenvalue Problems

Numerical Methods I Eigenvalue Problems Numerical Methods I Eigenvalue Problems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 October 2nd, 2014 A. Donev (Courant Institute) Lecture

More information

The Solvability Conditions for the Inverse Eigenvalue Problem of Hermitian and Generalized Skew-Hamiltonian Matrices and Its Approximation

The Solvability Conditions for the Inverse Eigenvalue Problem of Hermitian and Generalized Skew-Hamiltonian Matrices and Its Approximation The Solvability Conditions for the Inverse Eigenvalue Problem of Hermitian and Generalized Skew-Hamiltonian Matrices and Its Approximation Zheng-jian Bai Abstract In this paper, we first consider the inverse

More information

AMS Mathematics Subject Classification : 65F10,65F50. Key words and phrases: ILUS factorization, preconditioning, Schur complement, 1.

AMS Mathematics Subject Classification : 65F10,65F50. Key words and phrases: ILUS factorization, preconditioning, Schur complement, 1. J. Appl. Math. & Computing Vol. 15(2004), No. 1, pp. 299-312 BILUS: A BLOCK VERSION OF ILUS FACTORIZATION DAVOD KHOJASTEH SALKUYEH AND FAEZEH TOUTOUNIAN Abstract. ILUS factorization has many desirable

More information

Math Introduction to Numerical Analysis - Class Notes. Fernando Guevara Vasquez. Version Date: January 17, 2012.

Math Introduction to Numerical Analysis - Class Notes. Fernando Guevara Vasquez. Version Date: January 17, 2012. Math 5620 - Introduction to Numerical Analysis - Class Notes Fernando Guevara Vasquez Version 1990. Date: January 17, 2012. 3 Contents 1. Disclaimer 4 Chapter 1. Iterative methods for solving linear systems

More information

On the Computations of Eigenvalues of the Fourth-order Sturm Liouville Problems

On the Computations of Eigenvalues of the Fourth-order Sturm Liouville Problems Int. J. Open Problems Compt. Math., Vol. 4, No. 3, September 2011 ISSN 1998-6262; Copyright c ICSRS Publication, 2011 www.i-csrs.org On the Computations of Eigenvalues of the Fourth-order Sturm Liouville

More information