S-Step and Communication-Avoiding Iterative Methods


Maxim Naumov
NVIDIA, 2701 San Tomas Expressway, Santa Clara, CA 95050

Abstract

In this paper we give an overview of the s-step Conjugate Gradient method and develop a novel formulation for the s-step BiConjugate Gradient Stabilized iterative method. Also, we show how to add preconditioning to both of these s-step schemes. We explain their relationship to the standard, block and communication-avoiding counterparts. Finally, we explore their advantages, such as the availability of the matrix-power kernel A^k x and the use of block dot-products B = X^T Y that group individual dot products together, as well as their drawbacks, such as the extra computational overhead and the numerical stability issues related to the use of the monomial basis for the Krylov subspace K_s = {r, Ar, ..., A^{s-1} r}. We note that the mathematical techniques used in this paper can be applied to other methods in sparse linear algebra and related fields, such as optimization.

1 Introduction

In this paper we are concerned with investigating iterative methods for the solution of the linear system

    Ax = f                                                                      (1)

where the coefficient matrix A ∈ R^{n×n} is nonsingular, and the right-hand-side (RHS) f and the solution x are vectors in R^n. A comprehensive overview of iterative methods can be found in [1, 17]. In this brief study we focus on two popular algorithms: (i) the Conjugate Gradient (CG) and (ii) the BiConjugate Gradient Stabilized (BiCGStab) Krylov subspace iterative methods. These algorithms are often used to solve linear systems with symmetric positive definite (SPD) and nonsymmetric coefficient matrices, respectively.

NVIDIA Technical Report NVR, April 2016. © 2016 NVIDIA Corporation. All rights reserved. This research was developed, in part, with funding from the Defense Advanced Research Projects Agency (DARPA). The views, opinions, and findings contained in this article are those of the author and should not be interpreted as representing the official views or policies of the Department of Defense or the U.S. Government.

The CG and BiCGStab methods have already been generalized to their block counterparts in [11, 15]. The block methods work with s vectors simultaneously and were originally designed to solve linear systems with multiple RHS, such as

    AX = F                                                                      (2)

where the multiple RHS and solutions are column vectors in the tall matrices F and X ∈ R^{n×s}, respectively. However, they can be adapted to solve systems with a single RHS, for example by randomly selecting the remaining s−1 RHS [14]. In the latter case the block methods often perform fewer iterations than their standard counterparts, but they do not compute exactly the same solution in i iterations as the standard methods do in i × s iterations.

The s-step methods occupy a middle ground between their standard and block counterparts. They are usually applied to linear systems with a single RHS and they work with s vectors simultaneously, but they are also designed to produce an identical solution in i iterations as the standard methods in i × s iterations, when the computation is performed in exact arithmetic. In fact, the s-step methods can be viewed as an unrolling of s iterations and a clever re-grouping of the recurrence terms in them.

The s-step methods were first investigated by J. Van Rosendale, A. T. Chronopoulos and C. W. Gear in [4, 6, 16], where the authors proposed the s-step CG algorithm for SPD linear systems. Subsequent work by A. T. Chronopoulos and C. D. Swanson in [5, 7] led to the development of s-step methods, such as GMRES and Orthomin, but not BiCGStab, for nonsymmetric linear systems. These methods were further improved and generalized to the solution of eigenvalue problems using Lanczos and Arnoldi iterations in [8, 9].

There are several advantages to using s-step methods. The first advantage is that the matrix-vector multiplications used to build the Krylov subspace K_s = {r, Ar, ..., A^{s-1} r} can be performed together, one immediately after another. This allows us to develop a more efficient matrix-vector multiplication, using a pipeline where the results from the current multiplication are immediately forwarded as inputs to the next. The second advantage is that the dot products spread across multiple iterations are bundled together. This allows us to minimize the fan-in communication that is associated with dot products and that often limits the scalability of parallel platforms.

There are also some disadvantages to using s-step methods. We perform more work per iteration of an s-step method than per s iterations of a standard method. The extra work is often the result of one extra matrix-vector multiplication required to recompute the residual. Also, notice that in practice the vectors computed by the matrix-power kernel A^k x for k = 0, ..., s quickly become linearly dependent. Therefore, s-step methods can suffer from issues related to numerical stability resulting from the use of a monomial basis for the Krylov subspace corresponding to the s internal iterations.

In this paper we first give an overview of how to use properties of the direction and residual vectors in CG to build the s-step CG iterative method. Then, we find the properties of the direction and residual vectors in BiCGStab. Finally, we use this information in conjunction with the recurrence relationships for these vectors to develop a novel s-step BiCGStab iterative method. We also show how to add preconditioning to both of these s-step schemes.
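To make the two advantages concrete, the following sketch (ours, in Python with NumPy/SciPy, not part of the original paper) builds the monomial Krylov basis with back-to-back matrix-vector products and replaces s·s individual dot products by a single small GEMM; the 1D Laplacian test matrix is only an illustration.

```python
# A minimal sketch of the two kernels that make s-step methods attractive:
# the matrix-power kernel that builds K_s = [r, Ar, ..., A^{s-1} r], and a
# block dot-product that fuses many individual dot products into one GEMM.
import numpy as np
import scipy.sparse as sp

def monomial_basis(A, r, s):
    """Return K = [r, Ar, ..., A^{s-1} r] as an n-by-s array (matrix-power kernel)."""
    K = np.empty((r.shape[0], s))
    K[:, 0] = r
    for j in range(1, s):
        K[:, j] = A @ K[:, j - 1]   # each product immediately feeds the next one
    return K

# Example with a small SPD matrix (1D Laplacian) and a random residual.
n, s = 1000, 4
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
r = np.random.rand(n)

K = monomial_basis(A, r, s)
B = K.T @ (A @ K)   # block dot-product: s*s inner products in a single reduction
print(B.shape)      # (s, s)
```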

Finally, we point out that s-step iterative methods are closely related to the communication-avoiding algorithms proposed in [2, 3, 10, 12]. In fact, the main differences between these methods lie in (i) a different basis used by the communication-avoiding algorithms to address numerical stability, and (ii) an efficient (communication-avoiding) implementation of the matrix-power kernel A^k x as well as of distributed dense linear algebra operations, such as the dense QR-factorization used in GMRES [18]. Let us now show how to derive the well known s-step CG and the novel s-step BiCGStab iterative methods, as well as how to add preconditioning to both algorithms.

2 S-Step Conjugate Gradient (CG)

The standard CG method is shown in Alg. 1.

Algorithm 1 Standard CG
1:  Let A ∈ R^{n×n} be an SPD matrix and x_0 be an initial guess.
2:  Compute r_0 = f − Ax_0
3:  for i = 0, 1, ... until convergence do
4:    if i == 0 then                                            ▷ Find directions p
5:      Set p_i = r_i
6:    else
7:      Compute β_i = (r_i^T r_i)/(r_{i−1}^T r_{i−1})            ▷ Dot-product
8:      Compute p_i = r_i + β_i p_{i−1}
9:    end if
10:   Compute q_i = Ap_i                                        ▷ Matrix-vector multiply
11:   Compute α_i = (r_i^T r_i)/(r_i^T q_i)                      ▷ Dot-product
12:   Compute x_{i+1} = x_i + α_i p_i
13:   Compute r_{i+1} = r_i − α_i q_i
14:   if ||r_{i+1}||/||r_0|| ≤ tol then stop end                 ▷ Check convergence
15: end for

Let us now prove that the directions p_i and residuals r_i computed by CG satisfy a few properties. These properties will subsequently be used to set up the s-step CG method.

Lemma 1. The residuals r_i are orthogonal and the directions p_i are A-orthogonal

    r_i^T r_j = 0  and  p_i^T A p_j = 0  for i ≠ j                              (3)

Proof. See Proposition 6.13 in [17].

Theorem 1. The current residual r_i and all previous directions p_j are orthogonal

    r_i^T p_j = 0  for i > j                                                    (4)
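For reference, a minimal NumPy sketch of Alg. 1 follows; it mirrors the pseudo-code above (the denominator r_i^T q_i equals p_i^T q_i in exact arithmetic) and is not an optimized implementation.

```python
# A minimal NumPy sketch of the standard CG method of Alg. 1; the dot products
# on lines 7 and 11 are the global reductions that the s-step variant bundles.
import numpy as np

def cg(A, f, x0, tol=1e-10, maxiter=1000):
    x = x0.copy()
    r = f - A @ x
    r0_norm = np.linalg.norm(r)
    rho = r @ r
    p = r.copy()
    for i in range(maxiter):
        q = A @ p                      # matrix-vector multiply
        alpha = rho / (r @ q)          # dot-product (r^T q = p^T q in exact arithmetic)
        x = x + alpha * p
        r = r - alpha * q
        if np.linalg.norm(r) / r0_norm <= tol:   # check convergence
            break
        rho_new = r @ r
        beta = rho_new / rho           # dot-product
        p = r + beta * p
        rho = rho_new
    return x
```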

Proof. Using induction. Notice that for the base case i = 1 we have

    r_1^T p_0 = r_1^T r_0 = 0                                                   (5)

For the induction step, let us assume r_i^T p_{j−1} = 0. Then

    r_i^T p_j = r_i^T (r_j + β_j p_{j−1}) = β_j r_i^T p_{j−1} = 0               (6)

Let us now derive an alternative expression for the scalars α_i and β_i.

Corollary 1. The scalars α_i and β_i satisfy

    (p_i^T A p_i) α_i = p_i^T r_i                                               (7)
    (p_i^T A p_i) β_{i+1} = −p_i^T A r_{i+1}                                    (8)

Proof. Using Lemma 1 and Theorem 1 we have

    r_i^T r_{i+1} = r_i^T (r_i − α_i A p_i)
                  = (p_i − β_i p_{i−1})^T r_i − α_i (p_i − β_i p_{i−1})^T A p_i
                  = p_i^T r_i − α_i p_i^T A p_i = 0                             (9)

and

    p_i^T A p_{i+1} = p_i^T A (r_{i+1} + β_{i+1} p_i)
                    = p_i^T A r_{i+1} + β_{i+1} p_i^T A p_i = 0                 (10)

Let us now assume that we have s directions p_i available at once; then we can write

    x_{i+s} = x_i + α_i p_i + ... + α_{i+s−1} p_{i+s−1} = x_i + P_i a_i         (11)

where P_i = [p_i, ..., p_{i+s−1}] ∈ R^{n×s} is a tall matrix and a_i = [α_i, ..., α_{i+s−1}]^T ∈ R^s is a small vector. Consequently, the residual can be expressed as

    r_{i+s} = f − A x_{i+s} = r_i − A P_i a_i                                   (12)

Also, let us assume that we have the monomial basis of the Krylov subspace for s iterations

    R_i = [r_i, A r_i, ..., A^{s−1} r_i]                                        (13)

Then, following the CG algorithm, it would be natural to express the next directions as a combination of the previous directions and the basis vectors

    P_{i+1} = R_{i+1} + P_i B_i                                                 (14)

where B_i = [β_{j,k}] ∈ R^{s×s} is a small matrix. At this point we have almost all the building blocks of the s-step CG method, but we are still missing a way to find the scalars α and β stored in the vector a_i and the matrix B_i. These coefficients can be found using the following Corollary.

Corollary 2. The vector a_i = [α_i, ..., α_{i+s−1}]^T and matrix B_i = [β_{j,k}] satisfy

    (P_i^T A P_i) a_i = P_i^T r_i                                               (15)
    (P_i^T A P_i) B_i = −P_i^T A R_{i+1}                                        (16)

Proof. Enforcing the orthogonality of the residual and the search directions in Theorem 1 we have

    P_i^T r_{i+s} = P_i^T (r_i − A P_i a_i) = P_i^T r_i − (P_i^T A P_i) a_i = 0   (17)

Also, enforcing A-orthogonality between the search directions in Lemma 1 we obtain

    P_i^T A P_{i+1} = P_i^T A (R_{i+1} + P_i B_i) = P_i^T A R_{i+1} + (P_i^T A P_i) B_i = 0   (18)

Notice the similarity between Corollary 1 and Corollary 2 for the standard and s-step CG, respectively. Finally, putting all the equations (11) - (16) together we obtain the pseudo-code for the s-step CG iterative method, shown in Alg. 2.

Algorithm 2 S-Step CG
1:  Let A ∈ R^{n×n} be an SPD matrix and x_0 be an initial guess.
2:  Compute r_0 = f − Ax_0 and let index k = i·s (initially k = 0)
3:  for i = 0, 1, ... until convergence do
4:    Compute T = [r_k, Ar_k, ..., A^s r_k]                      ▷ Matrix-power kernel
5:    Let R_i = [r_k, Ar_k, ..., A^{s−1} r_k]                    ▷ Extract R = T(:, 1:s) from T
6:    Let Q_i = [Ar_k, A^2 r_k, ..., A^s r_k] = A R_i            ▷ Extract Q = T(:, 2:s+1) from T
7:    if i == 0 then                                             ▷ Find directions p
8:      Set P_i = R_i
9:    else
10:     Compute C_i = Q_i^T P_{i−1}                              ▷ Block dot-products
11:     Solve W_{i−1} B_{i−1} = −C_i^T                           ▷ Find scalars β
12:     Compute P_i = R_i + P_{i−1} B_{i−1}
13:   end if
14:   Compute W_i = Q_i^T P_i                                    ▷ Block dot-products
15:   Compute g_i = P_i^T r_k
16:   Solve W_i a_i = g_i                                        ▷ Find scalars α
17:   Compute x_{k+s} = x_k + P_i a_i                            ▷ Compute new approximation x
18:   Compute r_{k+s} = f − A x_{k+s}                            ▷ Recompute residual r
19:   if ||r_{k+s}||/||r_0|| ≤ tol then stop end                 ▷ Check convergence
20: end for
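A minimal NumPy sketch of Alg. 2 follows. It relies on the exact-arithmetic identities above (in particular W_i = Q_i^T P_i = P_i^T A P_i and C_i^T = P_{i−1}^T A R_i), uses the monomial basis, and is meant to expose the structure of the method rather than serve as a robust implementation.

```python
# A minimal dense-linear-algebra sketch of the s-step CG method of Alg. 2
# (monomial basis, no preconditioning), mirroring the pseudo-code above.
import numpy as np
import scipy.sparse as sp

def s_step_cg(A, f, x0, s=4, tol=1e-8, maxouter=300):
    x = x0.copy()
    r = f - A @ x
    r0_norm = np.linalg.norm(r)
    for i in range(maxouter):
        # Matrix-power kernel: T = [r, Ar, ..., A^s r]
        T = np.empty((len(r), s + 1))
        T[:, 0] = r
        for j in range(1, s + 1):
            T[:, j] = A @ T[:, j - 1]
        R, Q = T[:, :s], T[:, 1:]                 # R = [r, ..., A^{s-1} r], Q = A R
        if i == 0:
            P = R.copy()
        else:
            C = Q.T @ P_prev                      # block dot-products
            B = np.linalg.solve(W_prev, -C.T)     # find scalars beta
            P = R + P_prev @ B
        W = Q.T @ P                               # block dot-products (= P^T A P in exact arithmetic)
        g = P.T @ r
        a = np.linalg.solve(W, g)                 # find scalars alpha
        x = x + P @ a
        r = f - A @ x                             # recompute residual
        if np.linalg.norm(r) / r0_norm <= tol:
            break
        P_prev, W_prev = P, W
    return x

# Illustration on a small 1D Laplacian with f = A e.
n = 400
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
f = A @ np.ones(n)
x = s_step_cg(A, f, np.zeros(n), s=4)
print(np.linalg.norm(f - A @ x) / np.linalg.norm(f))   # relative residual reached
```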

Let us now introduce preconditioning into the s-step CG in a manner similar to the standard algorithm. Let the preconditioned linear system be

    (L^{−1} A L^{−T})(L^T x) = L^{−1} f                                         (19)

Then, we can apply the s-step scheme to the preconditioned system (19) and re-factor the recurrences in terms of the preconditioner M = LL^T. If we express everything in terms of the preconditioned directions M^{−1} P_i and work with the original solution vector x_i we obtain the preconditioned s-step CG method, shown in Alg. 3.

Notice that the tall matrices Z_i and Q_i on lines 6 and 7 can be constructed during the computation of T on line 5 in Alg. 3. The key observation is that the computation of T proceeds in the following fashion

    r_i, M^{−1} r_i, (A M^{−1}) r_i, M^{−1}(A M^{−1}) r_i, (A M^{−1})^2 r_i, ..., M^{−1}(A M^{−1})^{s−1} r_i, (A M^{−1})^s r_i

and therefore the alternating odd and even terms define the columns of the tall matrices Z_i and Q_i.

Algorithm 3 Preconditioned S-Step CG
1:  Let A ∈ R^{n×n} be an SPD matrix and x_0 be an initial guess.
2:  Let M = LL^T be an SPD preconditioner, so that M ≈ A.
3:  Compute r_0 = f − Ax_0, r̂_0 = M^{−1} r_0 and let index k = i·s (initially k = 0)
4:  for i = 0, 1, ... until convergence do
5:    Compute T = M^{−1}[r_k, Âr_k, ..., Â^s r_k]                ▷ Matrix-power kernel
6:    Let Z_i = M^{−1}[r_k, Âr_k, ..., Â^{s−1} r_k] = M^{−1} R_i  ▷ Build Z & Q during
7:    Let Q_i = [Âr_k, Â^2 r_k, ..., Â^s r_k] = A Z_i             ▷ computation of T
8:        where the auxiliary Â = A M^{−1} and R_i = [r_k, Âr_k, ..., Â^{s−1} r_k]
9:    if i == 0 then                                             ▷ Find directions p
10:     Set P_i = Z_i
11:   else
12:     Compute C_i = Q_i^T P_{i−1}                              ▷ Block dot-products
13:     Solve W_{i−1} B_{i−1} = −C_i^T                           ▷ Find scalars β
14:     Compute P_i = Z_i + P_{i−1} B_{i−1}
15:   end if
16:   Compute W_i = Q_i^T P_i                                    ▷ Block dot-products
17:   Compute g_i = P_i^T r_k
18:   Solve W_i a_i = g_i                                        ▷ Find scalars α
19:   Compute x_{k+s} = x_k + P_i a_i                            ▷ Compute new approximation x
20:   Compute r_{k+s} = f − A x_{k+s}                            ▷ Recompute residual r
21:   Compute r̂_{k+s} = M^{−1} r_{k+s}
22:   if ||r_{k+s}||/||r_0|| ≤ tol then stop end                 ▷ Check convergence
23: end for
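The interleaving described above can be sketched as follows; prec_solve stands for an application of M^{−1} (a Jacobi solve is used only as a stand-in example, not the paper's preconditioner).

```python
# A minimal sketch of how the preconditioned matrix-power kernel of Alg. 3
# interleaves preconditioner solves and matrix-vector products, so that the
# tall matrices Z_i and Q_i fall out as the odd and even terms of the sequence
# r, M^{-1}r, (AM^{-1})r, M^{-1}(AM^{-1})r, ...
import numpy as np
import scipy.sparse as sp

def preconditioned_power_kernel(A, prec_solve, r, s):
    """Return Z = M^{-1}[r, Ahat r, ..., Ahat^{s-1} r] and Q = A Z, with Ahat = A M^{-1}."""
    n = r.shape[0]
    Z = np.empty((n, s))
    Q = np.empty((n, s))
    v = r
    for j in range(s):
        Z[:, j] = prec_solve(v)    # odd terms:  M^{-1} (A M^{-1})^j r
        Q[:, j] = A @ Z[:, j]      # even terms: (A M^{-1})^{j+1} r
        v = Q[:, j]
    return Z, Q

# Example with a Jacobi (diagonal) preconditioner as a stand-in for M = LL^T.
n, s = 1000, 4
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csr")
diag = A.diagonal()
Z, Q = preconditioned_power_kernel(A, lambda v: v / diag, np.random.rand(n), s)
```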

Finally, notice that the preconditioned s-step CG performs one extra matrix-vector multiplication with A and one extra preconditioner solve with M per iteration, when compared with s iterations of the standard preconditioned CG method. The extra work happens during the re-computation of the residual at the end of every s-step iteration.

3 S-Step BiConjugate Gradient Stabilized (BiCGStab)

Let us now focus on the s-step BiCGStab method, which has not been previously derived by A. T. Chronopoulos et al. The standard BiCGStab is given in Alg. 4.

Algorithm 4 Standard BiCGStab
1:  Let A ∈ R^{n×n} be a nonsingular matrix and x_0 be an initial guess.
2:  Compute r_0 = f − Ax_0 and set p_0 = r_0
3:  Let ř_0 be arbitrary (for example ř_0 = r_0)
4:  for i = 0, 1, ... until convergence do
5:    Compute z_i = Ap_i                                         ▷ Matrix-vector multiply
6:    Compute α_i = (ř_0^T r_i)/(ř_0^T z_i)                       ▷ Dot-product
7:    Compute s_i = r_i − α_i z_i                                 ▷ Find directions s
8:    if ||s_i||/||r_0|| ≤ tol then set x_{i+1} = x_i + α_i p_i and stop end   ▷ Early exit
9:    Compute v_i = As_i                                          ▷ Matrix-vector multiply
10:   Compute ω_i = (s_i^T v_i)/(v_i^T v_i)                       ▷ Dot-product
11:   Compute x_{i+1} = x_i + α_i p_i + ω_i s_i
12:   Compute r_{i+1} = s_i − ω_i v_i
13:   if ||r_{i+1}||/||r_0|| ≤ tol or early exit then stop end    ▷ Check convergence
14:   Compute β_i = (ř_0^T r_{i+1})/(ř_0^T r_i) · (α_i/ω_i)
15:   Compute p_{i+1} = r_{i+1} + β_i (p_i − ω_i z_i)             ▷ Find directions p
16: end for

Recall that the directions p_i and residuals r_i in BiCGStab have been derived from the BiCG method using polynomial recurrences to avoid multiplication with the transpose of the coefficient matrix A^T [17]. The properties of the directions and residuals in the BiCG method are well known and can easily be used to set up the corresponding s-step method. The relationships between them in BiCGStab are far less clear. Therefore, the first step is to understand the properties of the directions s_i, p_i and residuals r_i, ř_0 in the BiCGStab method.

Theorem 2. The directions s_i, p_i and residuals r_i, ř_0 satisfy the following relationships

    ř_0^T s_i = 0                                                               (20)
    r_{i+1}^T A s_i = 0                                                         (21)
    ř_0^T (p_{i+1} − β_i p_i) = 0                                               (22)
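A minimal NumPy sketch of Alg. 4, mirroring the pseudo-code above (again not an optimized implementation):

```python
# A minimal NumPy sketch of the standard BiCGStab method of Alg. 4.
import numpy as np

def bicgstab(A, f, x0, tol=1e-10, maxiter=1000):
    x = x0.copy()
    r = f - A @ x
    r0_norm = np.linalg.norm(r)
    p = r.copy()
    r_hat = r.copy()                         # arbitrary shadow residual, here r_hat = r_0
    for i in range(maxiter):
        z = A @ p                            # matrix-vector multiply
        alpha = (r_hat @ r) / (r_hat @ z)    # dot-product
        s = r - alpha * z                    # find direction s
        if np.linalg.norm(s) / r0_norm <= tol:
            return x + alpha * p             # early exit
        v = A @ s                            # matrix-vector multiply
        omega = (s @ v) / (v @ v)            # dot-product
        x = x + alpha * p + omega * s
        r_new = s - omega * v
        if np.linalg.norm(r_new) / r0_norm <= tol:
            return x                         # check convergence
        beta = (r_hat @ r_new) / (r_hat @ r) * (alpha / omega)
        p = r_new + beta * (p - omega * z)   # find direction p
        r = r_new
    return x
```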

Proof. The first relationship (20) follows from

    ř_0^T s_i = ř_0^T (r_i − α_i A p_i) = ř_0^T r_i − α_i ř_0^T A p_i = ř_0^T r_i − ř_0^T r_i = 0   (23)

The second relationship (21) follows from

    r_{i+1}^T A s_i = (s_i − ω_i A s_i)^T A s_i = s_i^T A s_i − ω_i (A s_i)^T (A s_i) = s_i^T A s_i − s_i^T A s_i = 0   (24)

The last relationship (22) follows from

    ř_0^T (p_{i+1} − β_i p_i) = ř_0^T (r_{i+1} − β_i ω_i A p_i)                 (25)
                              = ř_0^T r_{i+1} − (ř_0^T r_{i+1} / ř_0^T A p_i) ř_0^T A p_i = 0   (26)

Notice that we can express the recurrences of BiCGStab in the following convenient form

    s_i = r_i − α_i A p_i                                                       (27)
    r_{i+1} = (I − ω_i A) s_i                                                   (28)
    q_i = (I − ω_i A) p_i                                                       (29)
    p_{i+1} = r_{i+1} + β_i q_i                                                 (30)

Corollary 3. Let the auxiliary vector y_i = s_i + β_i p_i; then the directions p_{i+1} = (I − ω_i A) y_i and

    ř_0^T A y_i = 0                                                             (31)

as long as ω_i ≠ 0.

Proof. Notice that using the recurrences (28), (29) and (30) we have

    p_{i+1} = r_{i+1} + β_i q_i = (I − ω_i A) s_i + β_i (I − ω_i A) p_i = (I − ω_i A)(s_i + β_i p_i)   (32)

so that A(s_i + β_i p_i) = (1/ω_i)(s_i + β_i p_i − p_{i+1}), and using the first and last relationships in Theorem 2 we have

    ř_0^T A(s_i + β_i p_i) = (1/ω_i) ( ř_0^T s_i − ř_0^T (p_{i+1} − β_i p_i) ) = 0   (33)

These properties are interesting by themselves, but it is difficult to rely only on them to set up a linear system for finding the scalars α, ω and β similarly to the CG and BiCG methods, because they involve a single vector corresponding to the shadow residual ř_0, rather than all directions p in CG or shadow directions p̌ in BiCG. Therefore, we have to find additional conditions and a different way of setting up the s-step method.
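The relationships (20) - (22) are easy to confirm numerically; the following small sketch performs one BiCGStab iteration by hand on a random nonsymmetric system (the matrix and seed are arbitrary) and prints the three inner products, which should vanish up to roundoff.

```python
# A small numerical check of the three relationships in Theorem 2 for the
# very first BiCGStab iteration on a random, diagonally dominant system.
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned nonsymmetric matrix
f = rng.standard_normal(n)

r0 = f.copy()                                     # x_0 = 0, so r_0 = f
p0, r_hat = r0.copy(), r0.copy()

alpha = (r_hat @ r0) / (r_hat @ (A @ p0))
s0 = r0 - alpha * (A @ p0)
omega = (s0 @ (A @ s0)) / ((A @ s0) @ (A @ s0))
r1 = s0 - omega * (A @ s0)
beta = (r_hat @ r1) / (r_hat @ r0) * (alpha / omega)
p1 = r1 + beta * (p0 - omega * (A @ p0))

print(abs(r_hat @ s0))                 # (20): ~0 up to roundoff
print(abs(r1 @ (A @ s0)))              # (21): ~0 up to roundoff
print(abs(r_hat @ (p1 - beta * p0)))   # (22): ~0 up to roundoff
```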

Let us now assume that we have s directions p_i and s_i available at once; then we have

    x_{i+s} = x_i + α_i p_i + ... + α_{i+s−1} p_{i+s−1} + ω_i s_i + ... + ω_{i+s−1} s_{i+s−1} = x_i + P_i a_i + S_i o_i   (34)

where P_i = [p_i, ..., p_{i+s−1}] and S_i = [s_i, ..., s_{i+s−1}] ∈ R^{n×s} are tall matrices and a_i = [α_i, ..., α_{i+s−1}]^T and o_i = [ω_i, ..., ω_{i+s−1}]^T ∈ R^s are small vectors. Consequently, the residual can be expressed as

    r_{i+s} = f − A x_{i+s} = r_i − A (P_i a_i + S_i o_i)                        (35)

Also, let us assume that we have the monomial basis of the Krylov subspace for s iterations and the monomial basis for the directions

    R_i = [r_i, A r_i, ..., A^{2(s−1)} r_i]                                      (36)
    P_i = [p_i, A p_i, ..., A^{2(s−1)} p_i]                                      (37)

Notice that using the recurrences (27) - (30) we can write the following

    [s_i, As_i, q_i, Aq_i] = [r_i, Ar_i, A^2 r_i, p_i, Ap_i, A^2 p_i] C          (38)
    [r_{i+1}, p_{i+1}] = [s_i, As_i, q_i, Aq_i] D                                (39)

where the transitional matrices of scalar coefficients C and D are

    C = C^{(1)} C^{(2)} = [  1      0      0      0   ]
                          [  0      1      0      0   ]
                          [  0      0      0      0   ]
                          [  0      0      1      0   ]
                          [ −α_i    0     −ω_i    1   ]
                          [  0     −α_i    0     −ω_i ]                          (40)

with C^{(1)} mapping the basis [r_i, Ar_i, A^2 r_i, p_i, Ap_i, A^2 p_i] to [s_i, As_i, p_i, Ap_i, A^2 p_i] and C^{(2)} mapping the latter to [s_i, As_i, q_i, Aq_i], and

    D = D^{(1)} D^{(2)} = [  1      1   ]
                          [ −ω_i   −ω_i ]
                          [  0      β_i ]
                          [  0      0   ]                                        (41)

with D^{(1)} mapping [s_i, As_i, q_i, Aq_i] to [r_{i+1}, q_i, Aq_i] and D^{(2)} mapping the latter to [r_{i+1}, p_{i+1}].
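The entries of C and D follow directly from the recurrences (27) - (30); the sketch below forms them for a single step and checks the basis relations (38) - (39) numerically (the test matrix is an arbitrary diagonally dominant example, not one from the paper).

```python
# A small sketch that forms the transitional matrices C and D of (40)-(41) for
# one BiCGStab step and checks the basis relations (38)-(39); alpha, omega and
# beta are computed by the standard recurrences first.
import numpy as np

rng = np.random.default_rng(1)
n = 200
A = rng.standard_normal((n, n)) + n * np.eye(n)
r = rng.standard_normal(n)
p, r_hat = r.copy(), r.copy()

alpha = (r_hat @ r) / (r_hat @ (A @ p))
s = r - alpha * (A @ p)
omega = (s @ (A @ s)) / ((A @ s) @ (A @ s))
r_next = s - omega * (A @ s)
beta = (r_hat @ r_next) / (r_hat @ r) * (alpha / omega)
q = p - omega * (A @ p)
p_next = r_next + beta * q

# Basis [r, Ar, A^2 r, p, Ap, A^2 p] and the transitional matrices C (6x4), D (4x2).
V = np.column_stack([r, A @ r, A @ (A @ r), p, A @ p, A @ (A @ p)])
C = np.array([[1, 0, 0, 0],
              [0, 1, 0, 0],
              [0, 0, 0, 0],
              [0, 0, 1, 0],
              [-alpha, 0, -omega, 1],
              [0, -alpha, 0, -omega]])
D = np.array([[1, 1],
              [-omega, -omega],
              [0, beta],
              [0, 0]])

SQ = np.column_stack([s, A @ s, q, A @ q])
print(np.linalg.norm(V @ C - SQ))                                   # (38): ~0
print(np.linalg.norm(SQ @ D - np.column_stack([r_next, p_next])))   # (39): ~0
```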

Also, note that for B = CD and k ≥ 0 we have

    A^k [r_{i+1}, p_{i+1}] = A^k [r_i, Ar_i, A^2 r_i, p_i, Ap_i, A^2 p_i] B      (42)

Therefore, letting B_i ∈ R^{(4s−2)×(4s−6)} follow the same pattern as B we have

    [r_{i+1}, ..., A^{2(s−2)} r_{i+1}, p_{i+1}, ..., A^{2(s−2)} p_{i+1}] = [r_i, ..., A^{2(s−1)} r_i, p_i, ..., A^{2(s−1)} p_i] B_i   (43)

and consequently

    [r_{i+s}, p_{i+s}] = [r_i, ..., A^{2(s−1)} r_i, p_i, ..., A^{2(s−1)} p_i] B_i B_{i+1} ⋯ B_{i+s−2}   (44)

Hence, we can use (44) to build the directions s_i, p_i and residuals r_i after s iterations, as long as we are able to compute the scalars α, ω and β used in the transitional matrices. Let us now compute the matrix W_i ∈ R^{(4s−2)×(4s−2)} and vector w_i ∈ R^{4s−2} such that

    W_i = [R_i, P_i]^T [R_i, P_i]                                               (45)
    w_i^T = ř_0^T [R_i, P_i] = [g_i^T, h_i^T]                                   (46)

and let W_i(j, k) denote the element in the j-th row and k-th column of the matrix W_i, while w_i(j) denotes the j-th element of the vector w_i. Also, let us use 1-based indexing. Then,

    α_i = (ř_0^T r_i)/(ř_0^T A p_i) = g_i(1)/h_i(2)                             (47)

Further, after the update

    W_i = C_i^{(1)T} W_i C_i^{(1)}                                              (48)

we have

    ω_i = (s_i^T A s_i)/((A s_i)^T (A s_i)) = W_i(1, 2)/W_i(2, 2)               (49)

and, after the update

    w_i = (C_i D_i^{(1)})^T w_i                                                 (50)

we obtain

    β_i = (ř_0^T r_{i+1})/(ř_0^T r_i) · (α_i/ω_i) = (w_i(1)/g_i(1)) · (α_i/ω_i)   (51)

Finally,

    W_{i+1} = (C_i^{(2)} D_i)^T W_i (C_i^{(2)} D_i) = B_i^T W_i B_i             (52)

and

    w_{i+1} = D_i^{(2)T} w_i = B_i^T w_i                                        (53)

Finally, putting all the equations (34) - (53) together we obtain the pseudo-code for the s-step BiCGStab iterative method, shown in Alg. 5.
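The following sketch illustrates the bookkeeping of (45) - (49) for the first inner iteration: once W and w have been formed with block dot-products, α is read off from w and, after the C^{(1)} update, ω and ||s||^2 are read off from W, with no further global reductions. The test matrix is arbitrary; the cross-checks against explicit dot products should agree up to roundoff.

```python
# A small sketch of the Gram-matrix bookkeeping of (45)-(49): alpha and omega of
# the first inner iteration come from entries of W and w only.
import numpy as np

rng = np.random.default_rng(2)
n = 200
A = rng.standard_normal((n, n)) + n * np.eye(n)
r = rng.standard_normal(n)
p, r_hat = r.copy(), r.copy()

# Monomial bases for r and p (three powers suffice for one inner iteration).
RP = np.column_stack([r, A @ r, A @ (A @ r), p, A @ p, A @ (A @ p)])
W = RP.T @ RP            # block dot-products
w = r_hat @ RP           # w = [g, h]

alpha = w[0] / w[4]      # alpha = (r_hat^T r)/(r_hat^T A p) = g(1)/h(2)

# Update W to the basis [s, As, p, Ap, A^2 p] with the transitional matrix C^(1).
C1 = np.zeros((6, 5))
C1[0, 0] = C1[1, 1] = 1.0
C1[4, 0] = C1[5, 1] = -alpha
C1[3, 2] = C1[4, 3] = C1[5, 4] = 1.0
Ws = C1.T @ W @ C1

omega = Ws[0, 1] / Ws[1, 1]          # omega = s^T A s / (A s)^T (A s), from W only

# Cross-check against the explicit dot products.
s = r - alpha * (A @ p)
print(abs(omega - (s @ (A @ s)) / ((A @ s) @ (A @ s))))   # ~0
print(abs(Ws[0, 0] - s @ s))                              # ||s||^2 from W: ~0
```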

Algorithm 5 S-Step BiCGStab
1:  Let A ∈ R^{n×n} be a nonsingular coefficient matrix and let x_0 be an initial guess.
2:  Compute r_0 = f − Ax_0 and set p_0 = r_0
3:  Let ř_0 be arbitrary (for example ř_0 = r_0) and let index k = i·s + j (initially k = 0)
4:  for i = 0, 1, ... until convergence do
5:    Compute T = [[r_k, p_k], A[r_k, p_k], ..., A^{2s−1}[r_k, p_k]]            ▷ Matrix-power kernel
6:    Let R_i = [r_k, Ar_k, ..., A^{2s−1} r_k] = [R̃_i, A^{2s−1} r_k]            ▷ Extract R from T
7:    Let P_i = [p_k, Ap_k, ..., A^{2s−1} p_k] = [P̃_i, A^{2s−1} p_k] = [p_k, P̄_i]   ▷ Extract P from T
8:    Compute W_i = [R_i, P_i]^T [R_i, P_i]                                     ▷ Block dot-products
9:    Compute w_i^T = ř_0^T [R_i, P_i] = [g_i^T, h_i^T]
10:   Let ˜ and ¯ indicate all but the last and all but the first columns, as shown for R_i and P_i
11:   for j = 0, ..., s−1 do
12:     Set α_k = g_k(1)/h_k(2) and store a_i(j) = α_k                          ▷ Note α_k = ř_0^T r_k / ř_0^T A p_k
13:     Compute S_k = R̃_k − α_k P̄_k                                             ▷ S_k = [s_k, As_k, ..., A^{2(s−j−1)} s_k]
14:     Update W_k = C_k^{(1)T} W_k C_k^{(1)}                                    ▷ Using s_k = r_k − α_k A p_k
15:     if √W_k(1, 1) / ||r_0|| ≤ tol then break end                            ▷ Check ||s_k||_2 for early exit
16:     Set ω_k = W_k(1, 2)/W_k(2, 2) and store o_i(j) = ω_k                    ▷ Note ω_k = s_k^T A s_k / (A s_k)^T (A s_k)
17:     Compute Q_k = P̃_k − ω_k P̄_k                                             ▷ Q_k = [q_k, Aq_k, ..., A^{2(s−j−1)} q_k]
18:     if (j == s−1) then break end                                            ▷ Check for last iteration
19:     Compute R_{k+1} = S̃_k − ω_k S̄_k                                         ▷ R_{k+1} = [r_{k+1}, ..., A^{2(s−j−1)−2} r_{k+1}]
20:     Update w_k = (C_k D_k^{(1)})^T w_k                                       ▷ Using r_{k+1} = s_k − ω_k A s_k
21:     Set β_k = (w_k(1)/g_k(1)) · (α_k/ω_k)                                   ▷ Note β_k = (ř_0^T r_{k+1} / ř_0^T r_k)(α_k/ω_k)
22:     Compute P_{k+1} = R_{k+1} + β_k Q̃_k                                      ▷ P_{k+1} = [p_{k+1}, ..., A^{2(s−j−1)−2} p_{k+1}]
23:     Update W_{k+1} = B_k^T W_k B_k                                          ▷ Using p_{k+1} = r_{k+1} + β_k q_k
24:     Update w_{k+1} = B_k^T w_k
25:   end for
26:   Let P_i = [p_{is}, ..., p_k] and S_i = [s_{is}, ..., s_k] have the constructed directions
27:   Compute x_{i+s} = x_i + P_i a_i + S_i o_i                                 ▷ Do not use the last s_k if early exit
28:   Compute r_{i+s} = f − A x_{i+s}                                           ▷ Note k = is + s − 1 unless early exit
29:   if ||r_{i+s}||/||r_0|| ≤ tol or early exit then stop end                  ▷ Check convergence
30:   Compute β_k = (ř_0^T r_{i+s} / ř_0^T r_k)(α_k/ω_k) = (ř_0^T r_{i+s} / g_k(1)) · (a_i(s−1)/o_i(s−1))
31:   Compute p_{i+s} = r_{i+s} + β_k q_k                                       ▷ Note α_k = a_i(s−1) and ω_k = o_i(s−1)
32: end for

It is worthwhile to mention again that the updates using the transitional matrices of scalar coefficients B_i = C_i D_i can be expressed in terms of recurrences. In fact, notice that the matrices C^{(1)}, C^{(2)}, D^{(1)}, D^{(2)} in equations (40) - (41) have a one-to-one correspondence to the recurrences in equations (27) - (30). We use a mix of transitional matrices and recurrences for clarity in the pseudo-code. However, we point out that the latter allows for a more efficient implementation of the algorithm in practice.

Also, notice that at each inner iteration the size of the tall matrices R_i and P_i, as well as of the small square matrix W_i and vector w_i, is reduced by 4 elements (columns for R_i and P_i, rows/columns for W_i and vector elements for w_i). These 4 elements correspond to the two extra powers A and A^2 applied to the residual r and the directions p. Consequently, the updates can be done in-place, preserving all previously computed residuals and directions without requiring extra memory storage for them.

Let us now introduce preconditioning into the s-step BiCGStab in a manner similar to the standard algorithm. Let the preconditioned linear system be

    (A M^{−1})(M x) = f                                                         (54)

Then, we can apply the s-step scheme to the preconditioned system (54) and re-factor the recurrences in terms of the preconditioner M = LU. If we express everything in terms of the preconditioned directions M^{−1} P_i and M^{−1} S_i and work with the original solution vector x_i we obtain the preconditioned s-step BiCGStab method, shown in Alg. 6.

Notice that the tall matrices R_i, P_i, V_i and Z_i on lines 7-10 can be constructed during the computation of T on line 6 in Alg. 6. The key observation is that the computation of T proceeds in the following fashion

    [r_i, p_i], M^{−1}[r_i, p_i], (A M^{−1})[r_i, p_i], M^{−1}(A M^{−1})[r_i, p_i], (A M^{−1})^2 [r_i, p_i], ..., M^{−1}(A M^{−1})^{2s−2}[r_i, p_i], (A M^{−1})^{2s−1}[r_i, p_i]   (55)

and therefore the alternating terms in the first and second columns define the tall matrices V_i, R_i and Z_i, P_i, respectively.

It is important to point out that in the preconditioned s-step BiCGStab we carry both R_i, P_i and V_i, Z_i through the s inner iterations. The latter pair is needed to compute the original x_{i+s} on line 30 and then the residual r_{i+s} on line 31, while the former pair is needed to compute the direction p̂_{i+s} on line 35 in Alg. 6. Both the residual and the direction are needed to compute T in the next outer iteration. Notice that this is different from the preconditioned s-step CG, where only the preconditioned residual is needed for the next outer iteration.

Finally, notice that the preconditioned s-step BiCGStab performs one extra matrix-vector multiplication with A and one extra preconditioner solve with M per iteration, when compared with s iterations of the standard preconditioned BiCGStab method. The extra work happens during the re-computation of the residual at the end of every s-step iteration.
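As an illustration of the right-preconditioned operator Â = A M^{−1} (using SciPy's incomplete LU purely as an example of M = LU; this is not the paper's setup), each application of Â is one preconditioner solve followed by one sparse matrix-vector product, which is exactly what the matrix-power kernel in Alg. 6 chains together:

```python
# A minimal sketch of the right-preconditioned operator Ahat = A M^{-1},
# with SciPy's ILU standing in for the M = LU preconditioner.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 1000
A = sp.diags([-1.0, 2.2, -1.1], [-1, 0, 1], shape=(n, n), format="csc")  # nonsymmetric example
ilu = spla.spilu(A, drop_tol=1e-3)       # incomplete factorization M = LU

def a_hat(v):
    """Apply Ahat = A M^{-1}: preconditioner solve, then matrix-vector product."""
    return A @ ilu.solve(v)

# The matrix-power kernel of Alg. 6 is then a chain of a_hat applications, with
# the intermediate M^{-1}-applied vectors kept as the columns of V_i and Z_i.
r = np.random.rand(n)
print(np.linalg.norm(a_hat(r) - r) / np.linalg.norm(r))   # small when M approximates A well
```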

Algorithm 6 Preconditioned S-Step BiCGStab
1:  Let A ∈ R^{n×n} be a nonsingular coefficient matrix and let x_0 be an initial guess.
2:  Let M = LU be the preconditioner, so that M ≈ A.
3:  Compute r_0 = f − Ax_0 and set p_0 = r_0
4:  Let ř_0 be arbitrary (for example ř_0 = r_0) and let index k = i·s + j (initially k = 0)
5:  for i = 0, 1, ... until convergence do
6:    Compute T = [[r_k, p_k], Â[r_k, p_k], ..., Â^{2s−1}[r_k, p_k]] with Â = A M^{−1}   ▷ Matrix-power kernel
7:    Let R_i = [r_k, Âr_k, ..., Â^{2s−1} r_k] = [R̃_i, Â^{2s−1} r_k]             ▷ Build R, P, V and Z
8:    Let P_i = [p_k, Âp_k, ..., Â^{2s−1} p_k] = [P̃_i, Â^{2s−1} p_k] = [p_k, P̄_i]   ▷ during computation of T
9:    Let V_i = M^{−1}[r_k, Âr_k, ..., Â^{2s−2} r_k] = M^{−1} R̃_i
10:   Let Z_i = M^{−1}[p_k, Âp_k, ..., Â^{2s−2} p_k] = M^{−1} P̃_i
11:   Compute W_i = [R_i, P_i]^T [R_i, P_i]                                      ▷ Block dot-products
12:   Compute w_i^T = ř_0^T [V_i, Z_i] = [g_i^T, h_i^T]
13:   Let ˜ and ¯ indicate all but the last and all but the first columns, as shown for R_i and P_i
14:   for j = 0, ..., s−1 do
15:     Set α_k = g_k(1)/h_k(2) and store a_i(j) = α_k                           ▷ Note α_k = ř_0^T r_k / ř_0^T Âp_k
16:     Compute [S_k, Ŝ_k] = [R̃_k, Ṽ_k] − α_k [P̄_k, Z̄_k]                        ▷ S_k = [s_k, Âs_k, ..., Â^{2(s−j−1)} s_k]
17:     Update W_k = C_k^{(1)T} W_k C_k^{(1)}                                    ▷ Using s_k = r_k − α_k Âp_k
18:     if √W_k(1, 1) / ||r_0|| ≤ tol then break end                             ▷ Check ||s_k||_2 for early exit
19:     Set ω_k = W_k(1, 2)/W_k(2, 2) and store o_i(j) = ω_k                     ▷ Note ω_k = s_k^T Âs_k / (Âs_k)^T (Âs_k)
20:     Compute [Q_k, Q̂_k] = [P̃_k, Z̃_k] − ω_k [P̄_k, Z̄_k]                        ▷ Q_k = [q_k, Âq_k, ..., Â^{2(s−j−1)} q_k]
21:     if (j == s−1) then break end                                             ▷ Check for last iteration
22:     Compute [R_{k+1}, V_{k+1}] = [S̃_k, Ŝ̃_k] − ω_k [S̄_k, Ŝ̄_k]                ▷ R_{k+1} = [r_{k+1}, ..., Â^{2(s−j−1)−2} r_{k+1}]
23:     Update w_k = (C_k D_k^{(1)})^T w_k                                       ▷ Using r_{k+1} = s_k − ω_k Âs_k
24:     Set β_k = (w_k(1)/g_k(1)) · (α_k/ω_k)                                    ▷ Note β_k = (ř_0^T r_{k+1} / ř_0^T r_k)(α_k/ω_k)
25:     Compute [P_{k+1}, Z_{k+1}] = [R_{k+1}, V_{k+1}] + β_k [Q̃_k, Q̂̃_k]        ▷ P_{k+1} = [p_{k+1}, ..., Â^{2(s−j−1)−2} p_{k+1}]
26:     Update W_{k+1} = B_k^T W_k B_k                                           ▷ Using p_{k+1} = r_{k+1} + β_k q_k
27:     Update w_{k+1} = B_k^T w_k
28:   end for
29:   Let Ẑ_i = M^{−1}[p_{is}, ..., p_k] and Ŝ_i = M^{−1}[s_{is}, ..., s_k] have the constructed directions
30:   Compute x_{i+s} = x_i + Ẑ_i a_i + Ŝ_i o_i                                  ▷ Do not use the last s_k if early exit
31:   Compute r_{i+s} = f − A x_{i+s}                                            ▷ Note k = is + s − 1 unless early exit
32:   if ||r_{i+s}||/||r_0|| ≤ tol or early exit then stop end                   ▷ Check convergence
33:   Compute r̂_{i+s} = M^{−1} r_{i+s}
34:   Compute β_k = (ř_0^T r_{i+s} / g_k(1)) · (a_i(s−1)/o_i(s−1))               ▷ Note α_k = a_i(s−1) and ω_k = o_i(s−1)
35:   Compute p̂_{i+s} = r̂_{i+s} + β_k q̂_k
36: end for

4 Numerical Experiments

In this section we study the numerical behavior of the preconditioned s-step CG and BiCGStab iterative methods. In order to facilitate comparisons, we use the same matrices as previous investigations of their standard and block counterparts [13, 14]. The seven SPD and five nonsymmetric matrices from the UFSMC [19], with their respective number of rows (m), columns (n = m) and non-zero elements (nnz), are grouped and shown in increasing order in Tab. 1.

We compare the s-step iterative methods using a reference MATLAB implementation. We let the initial guess be zero and the RHS f = Ae, where e = (1, ..., 1)^T. Also, unless stated otherwise, we let the stopping criteria be the maximum number of iterations 40 or relative residual ||r_i||/||r_0|| < 10^{-2}, where r_i = f − Ax_i is the residual at the i-th iteration. These modest stopping criteria allow us to illustrate the behavior of the algorithms without being significantly affected by the accumulation of floating point errors through iterations, which is addressed later on separately with different stopping criteria that match previous studies [13, 14].

#    Matrix            m, n        nnz         SPD   Application
1.   offshore          259,789     4,242,673   yes   Geophysics
2.   af_shell3         504,855     17,562,051  yes   Mechanics
3.   parabolic_fem     525,825     3,674,625   yes   General
4.   apache2           715,176     4,817,870   yes   Mechanics
5.   ecology2          999,999     4,995,991   yes   Biology
6.   thermal2          1,228,045   8,580,313   yes   Thermal Simulation
7.   G3_circuit        1,585,478   7,660,826   yes   Circuit Simulation
8.   FEM_3D_thermal2   147,900     3,489,300   no    Mechanics
9.   thermomech_dK     204,316     2,846,228   no    Mechanics
10.  ASIC_320ks        321,671     1,316,085   no    Circuit Simulation
11.  cage13            445,315     7,479,343   no    Biology
12.  atmosmodd         1,270,432   8,814,880   no    Atmospheric Model

Table 1: Symmetric positive definite (SPD) and nonsymmetric test matrices

First, let us illustrate that s-step iterative methods indeed produce the same approximation to the solution x_i after i iterations as their standard counterparts after i × s iterations. Notice that when we plot ||r_i|| through iterations for the standard scheme as a grey line and for the s-step scheme with s = 1 as blue circles, s = 2 as green stars and s = 4 as orange diamonds, these markers lie on the same curve and coincide in strides of s iterations, as can be seen in Fig. 1 and 2. The only difference might happen in the last iteration, where standard schemes can stop at an arbitrary iteration number, while s-step schemes must proceed to the next multiple of s. This indicates that both unpreconditioned and preconditioned s-step CG and BiCGStab indeed produce the same solution x_i as the standard iterative methods.
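A sketch of this experimental setup in Python (the Matrix Market file name is a placeholder for a locally downloaded UFSMC matrix, and SciPy's CG stands in for the reference MATLAB implementation):

```python
# A sketch of the experimental setup: RHS f = A e with e = (1,...,1)^T, zero
# initial guess, and the relative residual recorded at every iteration.
import numpy as np
import scipy.io
import scipy.sparse.linalg as spla

A = scipy.io.mmread("af_shell3.mtx").tocsr()      # hypothetical local copy of a UFSMC matrix
n = A.shape[0]
f = A @ np.ones(n)                                # RHS f = A e
x0 = np.zeros(n)

history = []
def record(xk):                                   # callback invoked once per iteration
    history.append(np.linalg.norm(f - A @ xk) / np.linalg.norm(f))

x, info = spla.cg(A, f, x0=x0, maxiter=40, callback=record)
print(history[-1])                                # relative residual after at most 40 iterations
```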

[Figure 1: Plot of convergence of the standard and s-step CG for s = 1, 2, 4 on the af_shell3 matrix. (a) Conjugate Gradient (CG); (b) Preconditioned CG.]

[Figure 2: Plot of convergence of the standard and s-step BiCGStab for s = 1, 2, 4 on the atmosmodd matrix. (a) BiConjugate Gradient Stabilized (BiCGStab); (b) Preconditioned BiCGStab.]

Second, let us make an important observation about the numerical stability of s-step methods. As we have mentioned earlier, the vectors in the monomial basis of the Krylov subspace used in s-step methods can become linearly dependent relatively quickly. In fact, if we plot ||r_i|| through iterations for s = 8 as red squares, we can easily identify several occasions where these markers do not coincide with the grey line used for the standard scheme, see Fig. 3 and 4. Notice that in the case of s-step CG in Fig. 3 (a) the s-step method diverged, while in the case of s-step BiCGStab in Fig. 4 (a) the method exited early, because the intermediate value ||s_k|| fell below a specified threshold. This behavior is in line with the observations made in [6] that for s > 5 the s-step CG may suffer from loss of orthogonality, and that the preconditioned s-step CG would likely fare better than its unpreconditioned version.

Third, let us look at the convergence of the preconditioned s-step iterative methods for different stopping criteria. In particular, we are interested in what happens when we ask the algorithm to achieve a tighter tolerance ||r_i||/||r_0|| < 10^{-7} and allow a larger maximum number of iterations. Notice that the residuals computed by the s-step methods always start by following the ones computed by the standard algorithms, see Fig. 5 and 6. In some cases, the s-step methods match their standard counterparts exactly until the solution is found, as shown for the offshore matrix in Fig. 5 (a). In other cases, they follow the residuals approximately, while still finding the solution towards the end, as shown for the atmosmodd matrix in Fig. 5 (b).

[Figure 3: Plot of convergence of the standard and s-step CG for s = 8 on the af_shell3 matrix. (a) Conjugate Gradient (CG); (b) Preconditioned CG.]

[Figure 4: Plot of convergence of the standard and s-step BiCGStab for s = 8 on the atmosmodd matrix. (a) BiConjugate Gradient Stabilized (BiCGStab); (b) Preconditioned BiCGStab.]

However, there are also cases where the residuals computed by the standard and s-step methods initially coincide, but the latter diverge from the solution towards the end, as shown for the matrices af_shell3 and apache2 in Fig. 6. These empirical results lead us to conclude that s-step methods are often better suited for linear systems where we are looking for an approximate solution with a relatively relaxed tolerance, such as 10^{-2}, and expect to find it in a modest number of iterations, for example < 80 iterations. Also, we recall that s ≤ 5 is recommended for most cases.

Finally, reverting to our original stopping criteria stated at the beginning of this section, we plot −log(||r_i||/||r_0||) obtained by the preconditioned standard and s-step methods for all the matrices in Fig. 7. Notice that because we plot the negative of the log, the higher the value on the plot in Fig. 7, the lower the relative residual. Notice that for most matrices the value of the relative residual across all schemes is either the same or slightly better for the s-step schemes, because the latter stop at an iteration number that is a multiple of s, which might be slightly higher than the iteration at which the standard algorithm has stopped. There are also a few cases, such as thermomech_dK corresponding to matrix 9, where the relative residual for the s-step method with s = 8 is worse, due to the loss of orthogonality of the vectors in the monomial basis.

[Figure 5: Plot of convergence of the preconditioned methods for the tighter tolerance ||r_i||/||r_0|| < 10^{-7}. (a) offshore; (b) atmosmodd.]

[Figure 6: Plot of convergence of the preconditioned methods for the tighter tolerance ||r_i||/||r_0|| < 10^{-7}. (a) af_shell3 matrix; (b) apache2 matrix.]

[Figure 7: Plot of −log(||r_i||/||r_0||) obtained by the preconditioned standard and s-step methods (s = 1, 2, 4, 8) for all test matrices; higher is better.]

The detailed results of the numerical experiments are shown in Tab. 2 and 3.

[Table 2: Results for the standard and s-step (s = 1, 2, 4, 8) unpreconditioned CG and BiCGStab methods: number of iterations and relative residual ||r_i||/||r_0|| for each test matrix.]

[Table 3: Results for the standard and s-step (s = 1, 2, 4, 8) preconditioned CG and BiCGStab methods: number of iterations and relative residual ||r_i||/||r_0|| for each test matrix.]

5 Conclusion

In this paper we gave an overview of the existing s-step CG iterative method and developed a novel s-step BiCGStab iterative method. We have also introduced preconditioning into both s-step algorithms, limiting the extra work performed for every s iterations to one extra matrix-vector multiplication and one preconditioner solve. We have discussed the advantages of these methods, such as the use of the matrix-power kernel and block dot-products, as well as their disadvantages, such as the loss of orthogonality of the monomial Krylov subspace basis and the extra work performed per iteration. Also, we have pointed out the connection between s-step and communication-avoiding algorithms. Finally, we performed numerical experiments to validate the theory behind the s-step methods, and will look at their performance in the future.

These algorithms can potentially be a better alternative to their standard counterparts when a modest tolerance and number of iterations are required to solve a linear system on parallel platforms. Also, they could be used for a larger class of problems if the numerical stability issues related to the monomial basis are resolved.

6 Acknowledgements

The author would like to acknowledge Erik Boman and Michael Garland for their useful comments and suggestions.

References

[1] R. Barrett, M. Berry, T. F. Chan, J. Demmel, J. Donato, J. Dongarra, V. Eijkhout, R. Pozo, C. Romine and H. van der Vorst, Templates for the Solution of Linear Systems: Building Blocks for Iterative Methods, SIAM, Philadelphia, PA, 1994.

[2] E. Carson, N. Knight and J. Demmel, Avoiding Communication in Nonsymmetric Lanczos-Based Krylov Subspace Methods, SIAM J. Sci. Comput., Vol. 35, pp. 42-61, 2012.

[3] E. Carson, Communication-Avoiding Krylov Subspace Methods in Theory and Practice, Ph.D. Thesis, University of California, Berkeley, 2015.

[4] A. T. Chronopoulos, A Class of Parallel Iterative Methods on Multiprocessors, Ph.D. Thesis, University of Illinois at Urbana-Champaign, 1987.

[5] A. T. Chronopoulos, S-Step Iterative Methods for (Non)Symmetric (In)Definite Linear Systems, SIAM Journal on Numerical Analysis, Vol. 28, 1991.

[6] A. T. Chronopoulos and C. W. Gear, S-Step Iterative Methods for Symmetric Linear Systems, Journal of Computational and Applied Mathematics, Vol. 25, 1989.

[7] A. T. Chronopoulos and C. D. Swanson, Parallel Iterative S-step Methods for Unsymmetric Linear Systems, Parallel Computing, Vol. 22, 1996.

[8] A. T. Chronopoulos and S. K. Kim, S-Step Orthomin and GMRES Implemented on Parallel Computers, Technical Report UMSI 90/43R, University of Minnesota Supercomputing Institute, 1990.

[9] A. T. Chronopoulos and S. K. Kim, S-step Lanczos and Arnoldi Methods on Parallel Computers, Technical Report UMSI 90/4R, University of Minnesota Supercomputing Institute, Minneapolis, 1990.

[10] M. M. Dehnavi, J. Demmel, D. Giannacopoulos and Y. El-Kurdi, Communication-Avoiding Krylov Techniques on Graphics Processing Units, IEEE Trans. on Magnetics, Vol. 49, 2013.

[11] A. El Guennouni, K. Jbilou and H. Sadok, A Block Version of BiCGStab for Linear Systems with Multiple Right-Hand Sides, ETNA, Vol. 16, 2003.

[12] M. Hoemmen, Communication-Avoiding Krylov Subspace Methods, Ph.D. Thesis, University of California, Berkeley, 2010.

[13] M. Naumov, Parallel Incomplete-LU and Cholesky Factorization in the Preconditioned Iterative Methods on the GPU, NVIDIA Technical Report, 2012.

[14] M. Naumov, Preconditioned Block-Iterative Methods on GPUs, Proc. International Association of Applied Mathematics and Mechanics, PAMM, Vol. 12, pp. 11-14, 2012.

[15] D. P. O'Leary, The Block Conjugate Gradient Algorithm and Related Methods, Linear Algebra Appl., Vol. 29, 1980.

[16] J. van Rosendale, Minimizing Inner Product Data Dependence in Conjugate Gradient Iteration, Technical Report 172178, NASA Langley Research Center, 1983.

[17] Y. Saad, Iterative Methods for Sparse Linear Systems, SIAM, Philadelphia, PA, 2nd Ed., 2003.

[18] I. Yamazaki, S. Rajamanickam, E. G. Boman, M. Hoemmen, M. A. Heroux and S. Tomov, Domain Decomposition Preconditioners for Communication-Avoiding Krylov Methods on a Hybrid CPU/GPU Cluster, Proc. International Conference for High Performance Computing, Networking, Storage and Analysis (SC), 2014.

[19] The University of Florida Sparse Matrix Collection (UFSMC).


More information

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication.

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication. CME342 Parallel Methods in Numerical Analysis Matrix Computation: Iterative Methods II Outline: CG & its parallelization. Sparse Matrix-vector Multiplication. 1 Basic iterative methods: Ax = b r = b Ax

More information

Solving Sparse Linear Systems: Iterative methods

Solving Sparse Linear Systems: Iterative methods Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccs Lecture Notes for Unit VII Sparse Matrix Computations Part 2: Iterative Methods Dianne P. O Leary c 2008,2010

More information

Solving Sparse Linear Systems: Iterative methods

Solving Sparse Linear Systems: Iterative methods Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit VII Sparse Matrix Computations Part 2: Iterative Methods Dianne P. O Leary

More information

Key words. linear equations, polynomial preconditioning, nonsymmetric Lanczos, BiCGStab, IDR

Key words. linear equations, polynomial preconditioning, nonsymmetric Lanczos, BiCGStab, IDR POLYNOMIAL PRECONDITIONED BICGSTAB AND IDR JENNIFER A. LOE AND RONALD B. MORGAN Abstract. Polynomial preconditioning is applied to the nonsymmetric Lanczos methods BiCGStab and IDR for solving large nonsymmetric

More information

Preface to the Second Edition. Preface to the First Edition

Preface to the Second Edition. Preface to the First Edition n page v Preface to the Second Edition Preface to the First Edition xiii xvii 1 Background in Linear Algebra 1 1.1 Matrices................................. 1 1.2 Square Matrices and Eigenvalues....................

More information

Communication-avoiding Krylov subspace methods

Communication-avoiding Krylov subspace methods Motivation Communication-avoiding Krylov subspace methods Mark mhoemmen@cs.berkeley.edu University of California Berkeley EECS MS Numerical Libraries Group visit: 28 April 2008 Overview Motivation Current

More information

The Conjugate Gradient Method

The Conjugate Gradient Method The Conjugate Gradient Method Classical Iterations We have a problem, We assume that the matrix comes from a discretization of a PDE. The best and most popular model problem is, The matrix will be as large

More information

Lecture Note 7: Iterative methods for solving linear systems. Xiaoqun Zhang Shanghai Jiao Tong University

Lecture Note 7: Iterative methods for solving linear systems. Xiaoqun Zhang Shanghai Jiao Tong University Lecture Note 7: Iterative methods for solving linear systems Xiaoqun Zhang Shanghai Jiao Tong University Last updated: December 24, 2014 1.1 Review on linear algebra Norms of vectors and matrices vector

More information

Course Notes: Week 1

Course Notes: Week 1 Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues

More information

Tikhonov Regularization of Large Symmetric Problems

Tikhonov Regularization of Large Symmetric Problems NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 2000; 00:1 11 [Version: 2000/03/22 v1.0] Tihonov Regularization of Large Symmetric Problems D. Calvetti 1, L. Reichel 2 and A. Shuibi

More information

Preconditioning Techniques Analysis for CG Method

Preconditioning Techniques Analysis for CG Method Preconditioning Techniques Analysis for CG Method Huaguang Song Department of Computer Science University of California, Davis hso@ucdavis.edu Abstract Matrix computation issue for solve linear system

More information

the method of steepest descent

the method of steepest descent MATH 3511 Spring 2018 the method of steepest descent http://www.phys.uconn.edu/ rozman/courses/m3511_18s/ Last modified: February 6, 2018 Abstract The Steepest Descent is an iterative method for solving

More information

Solving Symmetric Indefinite Systems with Symmetric Positive Definite Preconditioners

Solving Symmetric Indefinite Systems with Symmetric Positive Definite Preconditioners Solving Symmetric Indefinite Systems with Symmetric Positive Definite Preconditioners Eugene Vecharynski 1 Andrew Knyazev 2 1 Department of Computer Science and Engineering University of Minnesota 2 Department

More information

Jae Heon Yun and Yu Du Han

Jae Heon Yun and Yu Du Han Bull. Korean Math. Soc. 39 (2002), No. 3, pp. 495 509 MODIFIED INCOMPLETE CHOLESKY FACTORIZATION PRECONDITIONERS FOR A SYMMETRIC POSITIVE DEFINITE MATRIX Jae Heon Yun and Yu Du Han Abstract. We propose

More information

Conjugate Gradient Method

Conjugate Gradient Method Conjugate Gradient Method direct and indirect methods positive definite linear systems Krylov sequence spectral analysis of Krylov sequence preconditioning Prof. S. Boyd, EE364b, Stanford University Three

More information

Numerical Methods - Numerical Linear Algebra

Numerical Methods - Numerical Linear Algebra Numerical Methods - Numerical Linear Algebra Y. K. Goh Universiti Tunku Abdul Rahman 2013 Y. K. Goh (UTAR) Numerical Methods - Numerical Linear Algebra I 2013 1 / 62 Outline 1 Motivation 2 Solving Linear

More information

Chapter 7. Iterative methods for large sparse linear systems. 7.1 Sparse matrix algebra. Large sparse matrices

Chapter 7. Iterative methods for large sparse linear systems. 7.1 Sparse matrix algebra. Large sparse matrices Chapter 7 Iterative methods for large sparse linear systems In this chapter we revisit the problem of solving linear systems of equations, but now in the context of large sparse systems. The price to pay

More information

Linear algebra issues in Interior Point methods for bound-constrained least-squares problems

Linear algebra issues in Interior Point methods for bound-constrained least-squares problems Linear algebra issues in Interior Point methods for bound-constrained least-squares problems Stefania Bellavia Dipartimento di Energetica S. Stecco Università degli Studi di Firenze Joint work with Jacek

More information

Fine-Grained Parallel Algorithms for Incomplete Factorization Preconditioning

Fine-Grained Parallel Algorithms for Incomplete Factorization Preconditioning Fine-Grained Parallel Algorithms for Incomplete Factorization Preconditioning Edmond Chow School of Computational Science and Engineering Georgia Institute of Technology, USA SPPEXA Symposium TU München,

More information

Incomplete LU Preconditioning and Error Compensation Strategies for Sparse Matrices

Incomplete LU Preconditioning and Error Compensation Strategies for Sparse Matrices Incomplete LU Preconditioning and Error Compensation Strategies for Sparse Matrices Eun-Joo Lee Department of Computer Science, East Stroudsburg University of Pennsylvania, 327 Science and Technology Center,

More information

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 Instructions Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 The exam consists of four problems, each having multiple parts. You should attempt to solve all four problems. 1.

More information

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit VII Sparse Matrix Computations Part 1: Direct Methods Dianne P. O Leary c 2008

More information

Computational Linear Algebra

Computational Linear Algebra Computational Linear Algebra PD Dr. rer. nat. habil. Ralf-Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2018/19 Part 4: Iterative Methods PD

More information

High performance Krylov subspace method variants

High performance Krylov subspace method variants High performance Krylov subspace method variants and their behavior in finite precision Erin Carson New York University HPCSE17, May 24, 2017 Collaborators Miroslav Rozložník Institute of Computer Science,

More information

Comparison of Fixed Point Methods and Krylov Subspace Methods Solving Convection-Diffusion Equations

Comparison of Fixed Point Methods and Krylov Subspace Methods Solving Convection-Diffusion Equations American Journal of Computational Mathematics, 5, 5, 3-6 Published Online June 5 in SciRes. http://www.scirp.org/journal/ajcm http://dx.doi.org/.436/ajcm.5.5 Comparison of Fixed Point Methods and Krylov

More information

Lecture 17 Methods for System of Linear Equations: Part 2. Songting Luo. Department of Mathematics Iowa State University

Lecture 17 Methods for System of Linear Equations: Part 2. Songting Luo. Department of Mathematics Iowa State University Lecture 17 Methods for System of Linear Equations: Part 2 Songting Luo Department of Mathematics Iowa State University MATH 481 Numerical Methods for Differential Equations Songting Luo ( Department of

More information

ON THE GLOBAL KRYLOV SUBSPACE METHODS FOR SOLVING GENERAL COUPLED MATRIX EQUATIONS

ON THE GLOBAL KRYLOV SUBSPACE METHODS FOR SOLVING GENERAL COUPLED MATRIX EQUATIONS ON THE GLOBAL KRYLOV SUBSPACE METHODS FOR SOLVING GENERAL COUPLED MATRIX EQUATIONS Fatemeh Panjeh Ali Beik and Davod Khojasteh Salkuyeh, Department of Mathematics, Vali-e-Asr University of Rafsanjan, Rafsanjan,

More information

APPLIED NUMERICAL LINEAR ALGEBRA

APPLIED NUMERICAL LINEAR ALGEBRA APPLIED NUMERICAL LINEAR ALGEBRA James W. Demmel University of California Berkeley, California Society for Industrial and Applied Mathematics Philadelphia Contents Preface 1 Introduction 1 1.1 Basic Notation

More information

The quadratic eigenvalue problem (QEP) is to find scalars λ and nonzero vectors u satisfying

The quadratic eigenvalue problem (QEP) is to find scalars λ and nonzero vectors u satisfying I.2 Quadratic Eigenvalue Problems 1 Introduction The quadratic eigenvalue problem QEP is to find scalars λ and nonzero vectors u satisfying where Qλx = 0, 1.1 Qλ = λ 2 M + λd + K, M, D and K are given

More information

EXASCALE COMPUTING. Hiding global synchronization latency in the preconditioned Conjugate Gradient algorithm. P. Ghysels W.

EXASCALE COMPUTING. Hiding global synchronization latency in the preconditioned Conjugate Gradient algorithm. P. Ghysels W. ExaScience Lab Intel Labs Europe EXASCALE COMPUTING Hiding global synchronization latency in the preconditioned Conjugate Gradient algorithm P. Ghysels W. Vanroose Report 12.2012.1, December 2012 This

More information

A Method for Constructing Diagonally Dominant Preconditioners based on Jacobi Rotations

A Method for Constructing Diagonally Dominant Preconditioners based on Jacobi Rotations A Method for Constructing Diagonally Dominant Preconditioners based on Jacobi Rotations Jin Yun Yuan Plamen Y. Yalamov Abstract A method is presented to make a given matrix strictly diagonally dominant

More information

In order to solve the linear system KL M N when K is nonsymmetric, we can solve the equivalent system

In order to solve the linear system KL M N when K is nonsymmetric, we can solve the equivalent system !"#$% "&!#' (%)!#" *# %)%(! #! %)!#" +, %"!"#$ %*&%! $#&*! *# %)%! -. -/ 0 -. 12 "**3! * $!#%+,!2!#% 44" #% &#33 # 4"!#" "%! "5"#!!#6 -. - #% " 7% "3#!#3! - + 87&2! * $!#% 44" ) 3( $! # % %#!!#%+ 9332!

More information

Preconditioned inverse iteration and shift-invert Arnoldi method

Preconditioned inverse iteration and shift-invert Arnoldi method Preconditioned inverse iteration and shift-invert Arnoldi method Melina Freitag Department of Mathematical Sciences University of Bath CSC Seminar Max-Planck-Institute for Dynamics of Complex Technical

More information

Fast iterative solvers

Fast iterative solvers Utrecht, 15 november 2017 Fast iterative solvers Gerard Sleijpen Department of Mathematics http://www.staff.science.uu.nl/ sleij101/ Review. To simplify, take x 0 = 0, assume b 2 = 1. Solving Ax = b for

More information

A Block Compression Algorithm for Computing Preconditioners

A Block Compression Algorithm for Computing Preconditioners A Block Compression Algorithm for Computing Preconditioners J Cerdán, J Marín, and J Mas Abstract To implement efficiently algorithms for the solution of large systems of linear equations in modern computer

More information

Block Krylov Space Solvers: a Survey

Block Krylov Space Solvers: a Survey Seminar for Applied Mathematics ETH Zurich Nagoya University 8 Dec. 2005 Partly joint work with Thomas Schmelzer, Oxford University Systems with multiple RHSs Given is a nonsingular linear system with

More information

Lecture 17: Iterative Methods and Sparse Linear Algebra

Lecture 17: Iterative Methods and Sparse Linear Algebra Lecture 17: Iterative Methods and Sparse Linear Algebra David Bindel 25 Mar 2014 Logistics HW 3 extended to Wednesday after break HW 4 should come out Monday after break Still need project description

More information