Nested Krylov Methods Based on GCR

E. de Sturler^1
Faculty of Technical Mathematics and Informatics
Delft University of Technology
Mekelweg 4, Delft, The Netherlands

Abstract

Recently the GMRESR method for the solution of linear systems of equations has been introduced by Vuik and Van der Vorst. Similar methods have been proposed by Axelsson and Vassilevski [1] and Saad (FGMRES^2) [13]. GMRESR involves an outer and an inner method. The outer method is GCR, which is used to compute the optimal approximation over a given set of direction vectors, in the sense that the residual is minimized. The inner method is GMRES, which at each step computes a new direction vector by approximately solving the residual equation. This direction vector is then used by the outer algorithm to compute a new approximation. However, the optimality of the approximation over the space of search directions is ignored in the inner GMRES algorithm. This leads to suboptimal corrections to the solution in the outer algorithm. Therefore we propose to preserve the orthogonality relations of GCR in the inner GMRES algorithm. This gives optimal corrections to the solution and also leads to solving the residual equation in a smaller subspace and with an 'improved' operator, which should also lead to faster convergence. However, this involves using Krylov methods with a singular, non-symmetric operator. We will discuss some important properties of this. We will show by experiments that, in terms of matrix-vector multiplications, this modification (almost) always leads to better convergence. Because we do more orthogonalizations, it does not always give an improved performance in time. This depends on the cost of the matrix-vector multiplication relative to the cost of the orthogonalizations. Of course we can also use methods other than GMRES as the inner method; especially methods with short recurrences, like BICGSTAB, seem interesting. The experimental results indicate that especially for such methods it is advantageous to preserve the orthogonality in the inner method.

Key words. non-symmetric linear systems, iterative solvers, Krylov subspace, GCR, GMRES, GMRESR, BICGSTAB.

AMS(MOS) subject classification. 65F10.

1 Introduction

For the solution of systems of linear equations the so-called Krylov subspace methods are very popular. However, for general matrices no Krylov method can satisfy a global optimality requirement and have short recurrences [8]. Therefore one uses either restarted or truncated versions of optimal methods, like GMRES [14], or methods with short recurrences, which do not satisfy a global optimality requirement, like BiCG [9], BICGSTAB [17], BICGSTAB(l) [15], CGS [16] or QMR [11], which satisfies a quasi-minimization requirement.

^1 The author wishes to acknowledge Shell Research B.V. and STIPT for the financial support of his research.
^2 Since FGMRES and GMRESR are very similar, the ideas presented will be relevant for FGMRES as well.

Recently Vuik and Van der Vorst introduced a new type of method, GMRESR [18], which is a nested GMRES method. Among various other nice properties, GMRESR is flexible in the sense that a global minimization over some specific part of the Krylov space is done, and the dimension of this space and the vectors that span it are controlled by the algorithm, see figure 1.

GMRESR:
  1. Select x_0, m, tol; r_0 = b - A x_0, k = 0
  2. while ‖r_k‖_2 > tol do
       k = k + 1
       u_k = P_k^m(A) r_{k-1}
       c_k = A u_k
       for i = 1, ..., k-1 do
          α_i = c_i^T c_k;  c_k = c_k - α_i c_i;  u_k = u_k - α_i u_i
       c_k = c_k / ‖c_k‖_2;  u_k = u_k / ‖c_k‖_2
       x_k = x_{k-1} + u_k (c_k^T r_{k-1})
       r_k = r_{k-1} - c_k (c_k^T r_{k-1})

Figure 1: the GMRESR algorithm. P_k^m(A) r_{k-1} indicates the GMRES polynomial that is implicitly constructed in m steps of GMRES.

GCR:
  1. Select x_0, tol; r_0 = b - A x_0, k = 0
  2. while ‖r_k‖_2 > tol do
       k = k + 1
       u_k = r_{k-1}
       c_k = A u_k
       for i = 1, ..., k-1 do
          α_i = c_i^T c_k;  c_k = c_k - α_i c_i;  u_k = u_k - α_i u_i
       c_k = c_k / ‖c_k‖_2;  u_k = u_k / ‖c_k‖_2
       x_k = x_{k-1} + (c_k^T r_{k-1}) u_k
       r_k = r_{k-1} - (c_k^T r_{k-1}) c_k

Figure 2: the GCR algorithm.

The GMRESR algorithm consists of GCR [7], figure 2, as the outer algorithm and m steps of GMRES as inner method: the computation of P_k^m(A) r_{k-1} in figure 1. The inner GMRES method computes a new search direction by approximately solving the residual equation, and the outer GCR algorithm then minimizes the residual over the new search direction and all previously kept search directions u_i. The algorithm can be explained as follows. Suppose we are given the system of equations Ax = b, where A is a real, nonsingular, linear operator and b is a given right-hand side. Furthermore we have the two matrices

  U_k = (u_1 u_2 ... u_k)  and  C_k = A U_k   (1)

with the property that C_k^T C_k = I_k, and we want to solve the following minimization problem:

  min_{x ∈ range(U_k)} ‖b - Ax‖_2.   (2)

The solution of this minimization problem is given by

  x = U_k C_k^T b   (3)

and r = b - Ax satisfies

  r = b - C_k C_k^T b,  r ⟂ range(C_k).   (4)
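To make the structure of figure 2 concrete, the following sketch implements the GCR outer loop in NumPy, with the choice of the next direction vector left pluggable, so that plain GCR and GMRESR-like variants differ only in the inner argument. This is our own minimal illustration, not code from the paper; the function and argument names are ours, and no breakdown handling or preconditioning is included.

    import numpy as np

    def gcr(A, b, x0, tol=1e-8, max_outer=100, inner=None):
        # Minimal GCR outer loop (cf. figure 2). inner(A, r) returns the next
        # direction vector u_{k+1}; the default u = r gives plain GCR, while an
        # approximate solve of A u = r (e.g. a few GMRES steps) gives a
        # GMRESR-like method.
        x = x0.copy()
        r = b - A @ x
        U, C = [], []                        # kept directions u_i and c_i = A u_i
        for _ in range(max_outer):
            if np.linalg.norm(r) <= tol:
                break
            u = r.copy() if inner is None else inner(A, r)
            c = A @ u
            for ci, ui in zip(C, U):         # orthogonalize c_k against previous c_i
                alpha = ci @ c
                c = c - alpha * ci
                u = u - alpha * ui
            nc = np.linalg.norm(c)
            c = c / nc
            u = u / nc
            beta = c @ r                     # minimize ||r - beta c||_2
            x = x + beta * u
            r = r - beta * c
            U.append(u)
            C.append(c)
        return x, r

For a quick test one can take, for instance, A = np.diag(np.arange(1.0, 101.0)), b = np.ones(100) and x0 = np.zeros(100).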

In fact we have constructed a part of the inverse of A, more precisely the inverse of A over the subspace range(C_k). This inverse is given by

  A^{-1} C_k C_k^T = U_k C_k^T.   (5)

This principle underlies the GCR method, see figure 2, where for the preconditioned case A denotes the explicitly preconditioned operator. In GCR the matrices U_k and C_k are constructed such that range(U_k) equals the Krylov space K_k(A, r_0) = span{r_0, A r_0, ..., A^{k-1} r_0}, where r_0 = b - A x_0 and x_0 is an initial approximation to the solution. Provided GCR does not break down, it is a finite method and at step k it solves the minimization problem (2), with range(U_k) = K_k(A, r_0). It is obvious from equations (1)-(2) that in the step u_{k+1} = r_k (in GCR) we can replace r_k by any other vector and the algorithm still works in the sense that it solves the minimization problem (2); only range(U_k) = K_k(A, r_0) no longer holds. In general, the better the choice of u_{k+1}, the faster the convergence will be. The optimal choice, of course, is u_{k+1} = e_k, where e_k is the error in x_k. Therefore the key idea is to choose a suitable approximation to the error e_k. In order to find approximations to e_k, we use the relation

  A e_k = r_k   (6)

and any method which gives an approximate solution to this equation can be used to find good choices for u_{k+1}. We may also vary such methods from one step to the next, and we can regard them as preconditioners. These observations were made in [18] and lead to the formulation of the GMRESR method, figure 1, or rather methods, since many variants are possible. However, since we already have an optimal x ∈ range(U_k), (b - Ax) ⟂ range(A U_k), we need an approximation u_{k+1} to e_k such that c_{k+1} = A u_{k+1} is also orthogonal to range(C_k). This is done explicitly by the orthogonalization loop in the outer (GCR) algorithm. Because this is not taken into account in the inner method, the 'wrong' minimization problem is solved, since it also considers spurious corrections. Consider step k+1 in GMRESR: we have computed the matrices U_k, C_k and the vectors x_k and r_k. Then, in the inner GMRES loop (after m iterations), we solve the minimization problem

  min_{y ∈ R^m} ‖r_k - A V_m y‖_2   (7)

and in the outer loop we set

  u_{k+1} = (V_m y - U_k C_k^T A V_m y) / ‖(I - C_k C_k^T) A V_m y‖_2,
  c_{k+1} = (I - C_k C_k^T) A V_m y / ‖(I - C_k C_k^T) A V_m y‖_2,
  r_{k+1} = r_k - (r_k^T c_{k+1}) c_{k+1}.   (8)

It is obvious that for optimality we should have solved the minimization problem

  min_{y ∈ R^m} ‖r_k - (I - C_k C_k^T) A V_m y‖_2,   (9)

which is equivalent to

  min_{y ∈ R^m} ‖(I - C_k C_k^T)[r_k - A V_m y]‖_2,   (10)

since r_k ⟂ range(C_k). The results of the minimization problems (7) and (10) are equal only if range(A V_m) ⟂ range(C_k), which is generally not the case.
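The statements (2)-(5) are easy to verify numerically. The following small check uses randomly generated stand-ins for A, U_k and b (names and sizes are ours, chosen only for illustration) and confirms that x = U_k C_k^T b solves the least-squares problem over range(U_k) and that the resulting residual is orthogonal to range(C_k).

    import numpy as np

    rng = np.random.default_rng(1)
    n, k = 20, 3
    A = rng.standard_normal((n, n)) + 4 * np.eye(n)        # stand-in for A
    C, _ = np.linalg.qr(A @ rng.standard_normal((n, k)))   # orthonormal C_k
    U = np.linalg.solve(A, C)                               # so that C_k = A U_k
    b = rng.standard_normal(n)

    x = U @ (C.T @ b)                                       # (3)
    r = b - A @ x
    print(np.allclose(r, b - C @ (C.T @ b)))                # (4): r = (I - C_k C_k^T) b
    print(np.allclose(C.T @ r, 0))                          # r is orthogonal to range(C_k)

    coef, *_ = np.linalg.lstsq(A @ U, b, rcond=None)        # solve (2) directly
    print(np.allclose(U @ coef, x))                         # same minimizer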

Also, in GMRESR we search in principle the entire Krylov space K(A, r_k) = span{r_k, A r_k, A^2 r_k, ...} for an approximation u_{k+1} ≈ e_k, whereas

  e_k ∈ K(A, r_k) ∩ A^{-1} range(C_k)^⊥   (11)

and therefore

  r_k ∈ K(A, A r_k) ∩ range(C_k)^⊥.   (12)

Another disadvantage of GMRESR is that the inner loop is essentially a restarted GMRES with the same operator A. It therefore also displays some of the problems of restarted GMRES; most notably, it can have the tendency to stagnate in the inner method (see Numerical Experiments). From this we infer that it is favourable to preserve the GCR orthogonality relations also in the inner (GMRES) method, at least for convergence speed (not necessarily for overall performance), see theorem 1 in the next section. From equations (8), (9), (11) and (12) we come to the idea of using (I - C_k C_k^T) A as the operator in the Krylov method in the inner loop. This would generate corrections to the residual c_{k+1} ∈ K(A, b) ∩ range(C_k)^⊥. But it will search for updates to the approximation u_{k+1} ∈ span{r_k, (I - C_k C_k^T) A r_k, ...} ⊂ range(C_k)^⊥, whereas in general e_k ∉ range(C_k)^⊥. We should note, however, that corrections to the residual c_{k+1} correspond to corrections to the approximation u_{k+1} = A^{-1} c_{k+1} ∈ K(A, r_k) ∩ A^{-1} range(C_k)^⊥, which corresponds to (11). This may seem like a problem, since we have introduced the operator A^{-1}. But over the space span{(I - C_k C_k^T) A r_k, [(I - C_k C_k^T) A]^2 r_k, ...} the inverse of A is known, because we know the inverse of A over range(C_k), see (5):

  A^{-1} (I - C_k C_k^T) A = A^{-1} A - A^{-1} C_k C_k^T A = I - U_k C_k^T A.   (13)

In Krylov methods the solution itself is not needed to continue the iteration, only the residual. Therefore we can use a Krylov-method-like iteration for the residual and find the corresponding updates to the solution vector using (13). This leads to an extension of the GMRESR method, which has an improved performance for many problems. In fact this leads to a nested method, where the 'inner' Krylov method computes an approximation to the error in the 'outer' Krylov method, and the information acquired in the 'outer' method is used to speed up the convergence in the 'inner' method. In this article we will consider GMRES and BICGSTAB as inner methods. BICGSTAB has the advantage that its cost per iteration is low, so that we have a relatively cheap method to compute approximations to the error. Using GMRES in the inner method has the following attractive features. First, at the end of the inner iteration the error is minimized over the space spanned by both the search vectors of the outer method and those of the inner method, see theorem 1. Second, using a few parameters and truncation we can control the sizes of the Krylov subspaces in both the inner and the outer iteration and hence optimize the length of recurrences and memory requirements versus convergence speed, see [4] or [5] and [10]. In the next section we will discuss the implications of the orthogonalization in the inner method more formally. It will be proved that this leads to an optimal approximation over the space spanned by both the outer and the inner iteration vectors. It also introduces a potential problem: the possibility of breakdown in the generation of the Krylov space in the inner iteration, since we iterate with a singular operator. We will show, however, that such a breakdown is rare, because it can only happen after a specific (generally large) number of iterations. Furthermore, we will also show how to remedy such a breakdown.
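As an illustration of the operator (I - C_k C_k^T) A and of relation (13), the following sketch wraps both maps as functions. It assumes only that C has orthonormal columns and that C = A U; the names are ours, and the code is not an implementation of the full method.

    import numpy as np

    def make_projected_operator(A, U, C):
        # A_{C_k} = (I - C C^T) A, acting on residual corrections, and the map
        # (13): A^{-1} A_{C_k} = I - U C^T A, which translates a residual
        # correction v back into the corresponding solution correction
        # without applying A^{-1} (it only uses C = A U).
        def apply_ACk(v):
            w = A @ v
            return w - C @ (C.T @ w)
        def solution_correction(v):
            return v - U @ (C.T @ (A @ v))
        return apply_ACk, solution_correction

In exact arithmetic, A applied to solution_correction(v) reproduces apply_ACk(v), which is precisely what (13) expresses; this is what later allows the inner iteration to work with residuals only.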

In most of the following discussion we will consider m (not fixed) steps of GMRES to be the inner method, because the optimality of GMRES, and the fact that GMRES does not break down for a nonsingular operator, permit some statements about optimality and isolate the breakdown of the Krylov space generated by (I - C_k C_k^T) A from the other possibilities of breakdown in various Krylov methods.

2 Consequences of inner orthogonalization

We will now formally define the elements of our discourse, repeating some of the previous section for the sake of clarity.

Definition 1 The matrix A is a nonsingular, linear operator and the vector b is the given right-hand side of the system of equations Ax = b. We define the Krylov space K(B, x) = span{x, Bx, B^2 x, ...} and the Krylov subspace K_s(B, x) = span{x, Bx, B^2 x, ..., B^{s-1} x} for any linear operator B and any vector x. Furthermore the matrices U_k and C_k satisfy the relations

  U_k = (u_1 u_2 ... u_k),   (14)
  C_k = A U_k,   (15)

with the property

  C_k^T C_k = I_k,   (16)

and u_i, c_i ∈ K(A, b). We also have the vectors x_k and r_k satisfying

  x_k : min_{x ∈ range(U_k)} ‖b - Ax‖_2,   (17)
  x_k = U_k C_k^T b,   (18)
  r_k = b - C_k C_k^T b,   (19)
  r_k ⟂ range(C_k).   (20)

Next, we define the following operators:

  P_{C_k} = C_k C_k^T   (21)

and

  A_{C_k} = (I - P_{C_k}) A.   (22)

Because range(C_k) ⊂ K(A, b), the space K(A, b) ∩ range(C_k)^⊥ is an invariant subspace of A_{C_k}. We use m (not fixed) steps of the GMRES algorithm to compute the optimal approximation to r_k in the space span{A_{C_k} r_k, A_{C_k}^2 r_k, ..., A_{C_k}^m r_k} = K_m(A_{C_k}, A_{C_k} r_k). This leads to the optimal approximation to the solution over the (global) space

  range(U_k) + A^{-1} K_m(A_{C_k}, A_{C_k} r_k),

as follows.

Theorem 1 Let A, U_k, C_k, r_k, x_k, A_{C_k} and P_{C_k} be as in definition 1. Let {r_k, A_{C_k} r_k, A_{C_k}^2 r_k, ..., A_{C_k}^m r_k} be independent and let {v_1, ..., v_{m+1}} be an orthonormal basis for K_{m+1}(A_{C_k}, r_k), with v_1 = r_k/‖r_k‖_2, generated by m steps of GMRES. This defines the relation A_{C_k} V_m = V_{m+1} H̄_m. Let y be defined by

  y : min_{ỹ ∈ R^m} ‖r_k - A_{C_k} V_m ỹ‖_2 = min_{ỹ ∈ R^m} ‖r_k - V_{m+1} H̄_m ỹ‖_2.   (23)

Then the minimal residual solution of the (inner) GMRES method, A^{-1} A_{C_k} V_m y, gives the outer approximation

  x_{k+1} = x_k + A^{-1} A_{C_k} V_m y,   (24)

which is also the solution to the global minimization problem

  x_opt : min_{x̃ ∈ range(U_k) + range(V_m)} ‖b - A x̃‖_2.   (25)

Proof: The solution x_opt to the global minimization problem satisfies

  x_opt : min_{x̃ ∈ range(U_k) + range(V_m)} ‖b - A x̃‖_2
  ⇔ x_opt = U_k z + V_m y : min_{z ∈ R^k, y ∈ R^m} ‖b - A U_k z - A V_m y‖_2.   (26)

Furthermore, we have by construction of GMRES A_{C_k} V_m = V_{m+1} H̄_m, so that A V_m = P_{C_k} A V_m + V_{m+1} H̄_m and the minimization problem in (26) is equivalent to

  min_{z ∈ R^k, y ∈ R^m} ‖b - C_k [z + C_k^T A V_m y] - V_{m+1} H̄_m y‖_2.   (27)

Because range(C_k) ⟂ range(V_{m+1}), (27) is equivalent to

  z + C_k^T A V_m y = C_k^T b ⇔ z = C_k^T (b - A V_m y)   (28)

and, by (19),

  y : min_{ỹ ∈ R^m} ‖(I - C_k C_k^T) b - V_{m+1} H̄_m ỹ‖_2 ⇔ y : min_{ỹ ∈ R^m} ‖r_k - V_{m+1} H̄_m ỹ‖_2.   (29)

This results in

  x_opt = U_k z + V_m y,   (30)

where y is given by (29), which is equivalent to (23), and z is given by (28). For x_{k+1} we have, using (18), (23) and (24),

  x_{k+1} = x_k + A^{-1}(I - C_k C_k^T) A V_m y        (by (23), (24))
          = U_k C_k^T b + V_m y - U_k C_k^T A V_m y    (by (18) and (13))
          = U_k C_k^T (b - A V_m y) + V_m y
          = U_k z + V_m y.   □   (31)

It also follows from this theorem that the GCR optimization (in the outer iteration) is given by (24), so that the residual computed in the inner GMRES iteration equals the residual of the outer GCR iteration:

  r_{k+1} = b - A x_{k+1} = b - A x_k - A_{C_k} V_m y = r_k - A_{C_k} V_m y = r_k - V_{m+1} H̄_m y.   (32)

So the outer method only needs to compute the new u_{k+1} and c_{k+1} by

  c_{k+1} = (A_{C_k} V_m y) / ‖A_{C_k} V_m y‖_2,   (33)
  u_{k+1} = (A^{-1} A_{C_k} V_m y) / ‖A_{C_k} V_m y‖_2,   (34)

where A_{C_k} V_m y and A^{-1} A_{C_k} V_m y = (I - U_k C_k^T A) V_m y have already been computed as the residual update and the solution in the inner iteration, respectively, see (23) and (24). The outer GCR method consequently becomes very simple. We will now consider the possibility of breakdown when generating a Krylov space with a singular operator. The following theorem may not be a surprising one, but it is given because it captures the essence of the following discussion.

Theorem 2 Let B be a linear operator, let {y, By, B^2 y, ..., B^m y} be independent and let K_{m+1}(B, y) be an invariant subspace of B, so that {y, By, B^2 y, ..., B^{m+1} y} is dependent. Then:

1. B is nonsingular over K_{m+1}(B, y), and therefore has an inverse over this space, if and only if {By, B^2 y, ..., B^{m+1} y} is independent.

2. Let B^{m+1} y be given by

  B^{m+1} y = Σ_{i=0}^{m} α_i B^i y,   (35)

where B^0 = I. Then B is nonsingular over span{B^j y, B^{j+1} y, ..., B^m y}, and therefore has an inverse over this space, if and only if in (35) α_j ≠ 0 and α_i = 0 for 0 ≤ i ≤ j-1.

3. Let B^{m+1} y be given by (35) with α_j ≠ 0 and α_i = 0 for 0 ≤ i ≤ j-1. Then B is singular over span{B^{j-1} y, B^j y, ..., B^m y}. More specifically,

  ∃ z ∈ span{B^j y, ..., B^m y} : Bz = B^j y,

and therefore (B^{j-1} y - z) ∈ null(B).

Proof:

1. Assume that B is nonsingular over K_{m+1}(B, y); then dim B K_{m+1}(B, y) = dim K_{m+1}(B, y) and hence {By, B^2 y, ..., B^{m+1} y} is independent. Conversely, assume that {By, B^2 y, ..., B^{m+1} y} is independent. Then for every x ∈ B K_{m+1}(B, y) we have x = Σ_{i=0}^{m} α_i B^{i+1} y, where the α_i are uniquely determined, and we can construct x̃ = Σ_{i=0}^{m} α_i B^i y, so that B x̃ = x. Because the α_i are unique and both {y, By, ..., B^m y} and {By, B^2 y, ..., B^{m+1} y} are independent, this defines a one-to-one correspondence and hence B must be nonsingular over K_{m+1}(B, y).

2. First assume that in (35) α_j ≠ 0 and α_i = 0 for 0 ≤ i ≤ j-1. Then B^{m+1} y ∉ span{B^{j+1} y, ..., B^m y}, since {B^j y, ..., B^m y} is independent. Let ȳ = B^j y; then {ȳ, Bȳ, ..., B^{m-j} ȳ} is independent, K_{m+1-j}(B, ȳ) is an invariant subspace of B and {Bȳ, B^2 ȳ, ..., B^{m+1-j} ȳ} is independent. So by part 1 of this theorem we can conclude that B is nonsingular over K_{m+1-j}(B, ȳ) = span{B^j y, B^{j+1} y, ..., B^m y} and therefore has an inverse over this space. Second, assume that span{B^j y, B^{j+1} y, ..., B^m y} is an invariant subspace of B and that B is nonsingular over this space. Then obviously B^{m+1} y ∈ span{B^j y, B^{j+1} y, ..., B^m y}, so that in (35) α_i = 0 for 0 ≤ i ≤ j-1. Again let ȳ = B^j y, so that {ȳ, Bȳ, ..., B^{m-j} ȳ} is independent and K_{m+1-j}(B, ȳ) is an invariant subspace of B. Then by part 1 of this theorem {Bȳ, B^2 ȳ, ..., B^{m+1-j} ȳ} = {B^{j+1} y, B^{j+2} y, ..., B^{m+1} y} is also independent. Therefore B^{m+1} y ∉ span{B^{j+1} y, B^{j+2} y, ..., B^m y}, so that α_j ≠ 0.

3. Since B is nonsingular over span{B^j y, B^{j+1} y, ..., B^m y}, B has an inverse over span{B^j y, B^{j+1} y, ..., B^m y}, so that part 3 of the theorem follows trivially. □

Now, although GMRES is still optimal in the sense that at each iteration it computes the minimal residual solution over the generated Krylov space, the generation of the Krylov space itself, from a singular operator, may break down. The following simple example shows that this may happen before the solution is found, even though both the solution and the right-hand side are in the range of the given (singular) operator. Define the matrix A by

  A = (e_2 e_3 e_4 0),

where e_i denotes the i-th Cartesian basis vector. Note that A = (I - e_1 e_1^T)(e_2 e_3 e_4 e_1), which is the same type of operator as A_{C_k}: an orthogonal projection times a nonsingular operator. Now consider the system of equations Ax = e_3. Then GMRES (or any other Krylov method) will search for a solution in the space span{e_3, A e_3, A^2 e_3, ...} = span{e_3, e_4}, so we have a breakdown of the Krylov space and the solution is not contained in it, even though the solution exists and is an element of the range of A: A e_2 = e_3, and the right-hand side is in the orthogonal complement of null(A). So the singular unsymmetric case is quite different from the symmetric one.
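The 4x4 example above is small enough to check directly. The short script below (ours, for illustration only) builds A = (e_2 e_3 e_4 0), generates the Krylov vectors for the right-hand side e_3 and confirms that the space stalls at span{e_3, e_4} although the solution e_2 exists.

    import numpy as np

    e = np.eye(4)
    A = np.column_stack([e[:, 1], e[:, 2], e[:, 3], np.zeros(4)])   # A = (e_2 e_3 e_4 0)
    b = e[:, 2]                                                     # right-hand side e_3

    krylov = [b, A @ b, A @ A @ b]           # e_3, e_4, 0: the Krylov space breaks down
    print([list(np.round(v, 0)) for v in krylov])
    print(np.allclose(A @ e[:, 1], b))       # True: the solution e_2 exists in range(A)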

We will now define breakdown of the Krylov space for the inner GMRES iteration.

Definition 2 Let A, U_k, C_k, A_{C_k} and r_k be as in definition 1. Let {r_k, A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} be independent and let {v_1, v_2, ..., v_m} be an orthonormal basis for K_m(A_{C_k}, r_k), with v_1 = r_k/‖r_k‖_2, generated by m-1 steps of GMRES. We say we have a breakdown of the Krylov subspace if A_{C_k} v_m ∈ range(V_m), since this implies we can no longer expand the Krylov subspace. We call it a lucky breakdown if v_1 ∈ range(A_{C_k} V_m), because we have then found the solution (the inverse of A is known over the space range(A_{C_k} V_m)). We call it a true breakdown if v_1 ∉ range(A_{C_k} V_m), because then the solution is not contained in the Krylov subspace.

The following theorem relates true breakdown to the invariance of the sequence of subspaces in the inner method for the operator A_{C_k}. Part 3 indicates that it is always known whether a breakdown is true or lucky.

Theorem 3 Let A, U_k, C_k, A_{C_k} and r_k be as in definition 1. Let {r_k, A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} be independent and let {v_1, v_2, ..., v_i} be an orthonormal basis for K_i(A_{C_k}, r_k), for i = 1, ..., m, with v_1 = r_k/‖r_k‖_2, generated by m-1 steps of GMRES. Then at step m:

1. A true breakdown occurs if and only if range(A_{C_k} V_{m-1}) is an invariant subspace of A_{C_k}.
2. A true breakdown occurs if and only if A_{C_k} v_m ∈ range(A_{C_k} V_{m-1}).
3. A breakdown occurs if and only if we can define H_m by A_{C_k} V_m = V_m H_m; furthermore, it is a true breakdown if and only if H_m is singular.

Proof:

1. Assume that a true breakdown occurs. By assumption {r_k, ..., A_{C_k}^{m-1} r_k} is independent, and breakdown implies that K_m(A_{C_k}, r_k) is an invariant subspace of A_{C_k}. So v_1 ∉ range(A_{C_k} V_m) implies that A_{C_k} is singular over K_m(A_{C_k}, r_k). So by theorem 2 {A_{C_k} r_k, ..., A_{C_k}^m r_k} is dependent and therefore A_{C_k}^m r_k ∈ span{A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} = range(A_{C_k} V_{m-1}), and range(A_{C_k} V_{m-1}) is an invariant subspace of A_{C_k}. Conversely, assume that range(A_{C_k} V_{m-1}) = span{A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} is an invariant subspace of A_{C_k}. Obviously then span{r_k, ..., A_{C_k}^{m-1} r_k} is an invariant subspace of A_{C_k}, so that a breakdown will occur.

Furthermore, range(A_{C_k} V_m) = span{A_{C_k} r_k, ..., A_{C_k}^m r_k} and, since A_{C_k}^m r_k ∈ span{A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} by assumption, we have range(A_{C_k} V_m) = range(A_{C_k} V_{m-1}). Because {r_k, ..., A_{C_k}^{m-1} r_k} is independent by assumption, r_k ∉ range(A_{C_k} V_m), so that a true breakdown occurs.

2. Assume a true breakdown occurs. Then by part 1 of this proof range(A_{C_k} V_{m-1}) is an invariant subspace and hence range(A_{C_k} V_m) = range(A_{C_k} V_{m-1}), so that A_{C_k} v_m ∈ range(A_{C_k} V_{m-1}). Assume, on the other hand, that A_{C_k} v_m ∈ range(A_{C_k} V_{m-1}). Then A_{C_k} v_m ∈ span{A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} ⊂ range(V_m), so that a breakdown occurs. Furthermore, A_{C_k} v_m ∈ range(A_{C_k} V_{m-1}) implies range(A_{C_k} V_m) = range(A_{C_k} V_{m-1}). Because {r_k, ..., A_{C_k}^{m-1} r_k} is independent by assumption, r_k ∉ range(A_{C_k} V_{m-1}) = range(A_{C_k} V_m); consequently a true breakdown occurs.

3. Assume first that a breakdown occurs. Then by definition 2, A_{C_k} v_m ∈ range(V_m); GMRES defines the relation A_{C_k} V_{m-1} = V_m H̄_{m-1}, and therefore we can define an H_m ∈ R^{m×m} such that A_{C_k} V_m = V_m H_m. Second, assume that an H_m exists which satisfies A_{C_k} V_m = V_m H_m. Then A_{C_k} v_m ∈ range(V_m), so that a breakdown occurs. Now assume that a true breakdown occurs. Let H_m be nonsingular; then ∃x : H_m x = e_1, where e_1 denotes the first Cartesian basis vector, so that v_1 = V_m H_m x = A_{C_k} V_m x ⇒ v_1 ∈ range(A_{C_k} V_m). This contradicts our assumptions, which implies that H_m must be singular. Finally, assume that H_m is singular. Then dim V_m H_m = dim A_{C_k} V_m < m, which implies that {A_{C_k} r_k, ..., A_{C_k}^m r_k} is dependent. This, in turn, leads to range(A_{C_k} V_m) = span{A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} = range(A_{C_k} V_{m-1}), so that a true breakdown occurs (see above or theorem 2). □

From the previous theorem and its proof one can already conclude that a true breakdown occurs if and only if A_{C_k} is singular over K_m(A_{C_k}, r_k). From definition 1 we know null(A_{C_k}) = range(U_k). We will make this more explicit in the following theorem, which relates true breakdown to the intersection of the inner search space and the outer search space.

Theorem 4 Let A, U_k, C_k, A_{C_k} and r_k be as in definition 1. Let {r_k, A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} be independent and let {v_1, v_2, ..., v_i} be an orthonormal basis for K_i(A_{C_k}, r_k), for i = 1, ..., m, with v_1 = r_k/‖r_k‖_2, generated by m-1 steps of GMRES. A true breakdown occurs at step m if and only if

  ∃ u ≠ 0, u ∈ range(V_m) : u ∈ range(U_k).

Proof: Let u ≠ 0, u ∈ range(U_k) and u ∈ range(V_m). By definition 1, A_{C_k} u = 0; consequently dim(range(A_{C_k} V_m)) < m, so that {A_{C_k} r_k, ..., A_{C_k}^m r_k} is dependent. This implies (see the proof of theorem 3, or theorem 2) that range(A_{C_k} V_{m-1}) is an invariant subspace of A_{C_k}, so that a true breakdown occurs. Conversely, assume that a true breakdown occurs. Then by theorem 3 range(A_{C_k} V_{m-1}) is an invariant subspace of A_{C_k}, which implies that {A_{C_k} r_k, ..., A_{C_k}^m r_k} is dependent. Therefore

  ∃ u ≠ 0, u ∈ span{r_k, ..., A_{C_k}^{m-1} r_k} = range(V_m) : u ∈ null(A_{C_k}) ⇒ u ∈ range(U_k). □

The following theorem indicates that breakdown cannot occur in the inner GMRES method before the total number of iterations exceeds the dimension of the Krylov space K(A, b). This means that, in practice, a breakdown will be rare.

Theorem 5 Let A, U_k, C_k, A_{C_k} and r_k be as in definition 1. Let m = dim(K(A, b)), that is, m = max{s : {b, Ab, ..., A^{s-1} b} is independent}. Define P_s(A) b = Σ_{i=0}^{s} α_i A^i b with α_s ≠ 0, let u_i = P_{l_i - 1}(A) b for i = 1, ..., k, and l = max_{i=1,...,k} l_i < m; then r_k = (I - P_{C_k}) b = P_l(A) b. We call l the total number of iterations. Then, if j + l < m (j < m - l),

  {r_k, A_{C_k} r_k, ..., A_{C_k}^j r_k} is independent,

and therefore no breakdown occurs in the first j steps of GMRES.

Proof: By definition c_i = A u_i ⇒ c_i ∈ span{Ab, ..., A^l b}. Further, r_k = P_l(A) b ⇒ A_{C_k} r_k = (I - P_{C_k}) A P_l(A) b = P_{l+1}(A) b, and analogously we find A_{C_k}^i r_k = P_{l+i}(A) b for i = 1, ..., j. So we have that {r_k, A_{C_k} r_k, ..., A_{C_k}^j r_k} = {P_l(A) b, P_{l+1}(A) b, ..., P_{l+j}(A) b} is independent by the definition of m. Therefore no breakdown can occur. □

We will now show how a true breakdown can be overcome. There are basically two ways to continue:

- in the inner iteration, by finding a suitable vector to expand the Krylov space;
- in the outer iteration, by computing the solution of the inner iteration just before the breakdown and continuing with one LSQR-step (see below) in the outer iteration.

Continuation in the inner iteration. The following theorem indicates how we can find a suitable vector to expand the Krylov subspace.

Theorem 6 Let A, U_k, C_k, A_{C_k} and r_k be as in definition 1, let {r_k, A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} be independent and let {v_1, v_2, ..., v_i} be an orthonormal basis for K_i(A_{C_k}, r_k), for i = 1, ..., m, with v_1 = r_k/‖r_k‖_2, generated by m-1 steps of GMRES. If a true breakdown occurs, then

  ∃ c ∈ range(C_k) : A_{C_k} c ∉ range(A_{C_k} V_{m-1}),

which implies

  ∃ c_i ∈ {c_1, ..., c_k} : A_{C_k} c_i ∉ range(A_{C_k} V_{m-1}).

Proof: By theorem 3 we have A_{C_k} v_m ∈ range(A_{C_k} V_{m-1}). Now assume that we stop after the last regular GMRES step, (m-1), and compute r_{k+1} using (32) and c_{k+1} using (33). Then from r_{k+1} = r_k - A_{C_k} V_{m-1} y = ‖r_k‖_2 v_1 - V_m H̄_{m-1} y we have r_{k+1} ∈ range(V_m) and therefore A_{C_k} r_{k+1} ∈ range(A_{C_k} V_{m-1}). Because A is a nonsingular operator, r_{k+1} ∈ span{A r_{k+1}, ..., A^p r_{k+1}} for some p, so that r_{k+1} = Σ_{j=1}^{p} γ_j A^j r_{k+1} ⇒ r_{k+1} = (I - P_{C_k}) r_{k+1} = Σ_{j=1}^{p} γ_j (I - P_{C_k}) A^j r_{k+1}, and since r_{k+1} ≠ 0, r_{k+1} ⟂ range(C_k) and r_{k+1} ⟂ range(A_{C_k} V_{m-1}):

  ∃ j : (I - P_{C_k}) A^j r_{k+1} ≠ 0 and (I - P_{C_k}) A^j r_{k+1} ∉ range(A_{C_k} V_{m-1}).

Since A_{C_k} r_{k+1} ∈ range(A_{C_k} V_{m-1}), we can now define s by

  s : (I - P_{C_k}) A^s r_{k+1} ∉ range(A_{C_k} V_{m-1}) and (I - P_{C_k}) A^j r_{k+1} ∈ range(A_{C_k} V_{m-1}) for 1 ≤ j < s,

and then

  (I - P_{C_k}) A^s r_{k+1} = (I - P_{C_k}) A (I - P_{C_k}) A^{s-1} r_{k+1} + (I - P_{C_k}) A P_{C_k} A^{s-1} r_{k+1},

where by assumption (I - P_{C_k}) A^{s-1} r_{k+1} ∈ range(A_{C_k} V_{m-1}) ⇒ (I - P_{C_k}) A (I - P_{C_k}) A^{s-1} r_{k+1} ∈ range(A_{C_k} V_{m-1}). This implies that (I - P_{C_k}) A P_{C_k} A^{s-1} r_{k+1} ∉ range(A_{C_k} V_{m-1}), and since P_{C_k} A^{s-1} r_{k+1} ∈ range(C_k) we have

  ∃ c ∈ range(C_k) : A_{C_k} c ∉ range(A_{C_k} V_{m-1}),

which in turn implies

  ∃ c_i ∈ {c_1, ..., c_k} : A_{C_k} c_i ∉ range(A_{C_k} V_{m-1}). □

Not just any c ∈ range(C_k) will do, as is shown by the following example. Let u_1 = b; then A_{C_k} r_k = A_{C_k}(b - C_k C_k^T b) = -A_{C_k} C_k C_k^T b. Therefore we simply must try the c_i until one of them works.

Continuation in the outer iteration. Another way to continue after a true breakdown in the inner GMRES iteration is to compute the inner iteration solution just before the breakdown and to apply an LSQR-switch (see below) in the outer GCR iteration. The following theorem states the reason why the LSQR-switch must be applied.

Theorem 7 Let A, U_k, C_k, A_{C_k} and r_k be as in definition 1, let {r_k, A_{C_k} r_k, ..., A_{C_k}^{m-1} r_k} be independent and let {v_1, v_2, ..., v_i} be an orthonormal basis for K_i(A_{C_k}, r_k), for i = 1, ..., m, with v_1 = r_k/‖r_k‖_2, generated by m-1 steps of GMRES. Suppose a true breakdown occurs at step m and we compute the solution at the previous GMRES step, switch to the outer method and make one GCR step (24), (32), (33) and (34), so we compute r_{k+1} = r_k - V_m H̄_{m-1} y and c_{k+1} = A_{C_k} V_{m-1} y / ‖A_{C_k} V_{m-1} y‖_2. Then we will have stagnation in the next inner GMRES iteration, that is,

  r_{k+1} ⟂ span{A_{C_{k+1}} r_{k+1}, A_{C_{k+1}}^2 r_{k+1}, ...}.

Proof: We have c_{k+1} ∈ range(A_{C_k} V_{m-1}); furthermore r_{k+1} ∈ range(V_m) and A_{C_k} r_{k+1} ∈ range(A_{C_k} V_{m-1}) by theorem 3. A_{C_{k+1}} = (I - c_{k+1} c_{k+1}^T) A_{C_k}, and from this and theorem 3 it follows that range(A_{C_k} V_{m-1}) is also an invariant subspace of A_{C_{k+1}} and A_{C_{k+1}} r_{k+1} ∈ range(A_{C_k} V_{m-1}). □

The reason for the stagnation is that the new residual r_{k+1} remains in the same Krylov space K(A_{C_k}, r_k), which contains a u ≠ 0 in range(U_k). So we have to 'leave' this Krylov space. We can do this using the so-called LSQR-switch, which was introduced in [18] to remedy stagnation in the inner method. Just as in the GMRESR method, stagnation in the inner method would result in a breakdown in the outer GCR method, because the residual cannot be updated. The following theorem indicates that the LSQR-switch always works.

Theorem 8 If stagnation occurs in the inner GMRES method, that is,

  min_{ỹ ∈ R^{m-1}} ‖r_{k+1} - A_{C_{k+1}} V_{m-1} ỹ‖_2 = ‖r_{k+1}‖_2,

then we can continue by setting (LSQR-switch)

  c_{k+2} = β (I - C_{k+1} C_{k+1}^T) A A^T r_{k+1}  and  u_{k+2} = β A^{-1} (I - C_{k+1} C_{k+1}^T) A A^T r_{k+1},

where β is a normalization constant. This leads to

  r_{k+2} = r_{k+1} - (r_{k+1}^T c_{k+2}) c_{k+2}  and  x_{k+2} = x_{k+1} + (r_{k+1}^T c_{k+2}) u_{k+2},

which always gives an improved approximation. Therefore these vectors can also be used as the start vectors for a new Krylov subspace in the inner GMRES method instead, if desired.

Proof: (following the proof in [18]) Because r_{k+1} ⟂ range(C_{k+1}),

  c_{k+2}^T r_{k+1} = β r_{k+1}^T A A^T r_{k+1} = β ‖A^T r_{k+1}‖_2^2 ≠ 0.

So this step will always give an improvement to the residual. This also proves that (I - C_{k+1} C_{k+1}^T) A A^T r_{k+1} ∉ range(A_{C_k} V_{m-1}), because r_{k+1} ⟂ range(A_{C_k} V_{m-1}), so that we may also use this vector to build a Krylov space in the inner iteration after a true breakdown in the previous inner iteration. □

3 Implementation

A straightforward implementation of the GCR method and an orthogonalizing inner method will obviously result in a large number of vector updates, inner products and vector scalings. We will therefore describe an implementation which strongly reduces this work. This implementation is mainly due to [10]. First we will consider the outer GCR method, next the inner GMRES method and finally the implementation of BICGSTAB as the inner algorithm. Instead of the matrices U_k and C_k (see definition 1) we will use in the actual implementation the matrices Û_k, C̄_k, N_k, Z_k and the vector d_k, which are defined below.

Definition 3 Let A, U_k, C_k, b and r_k be as in definition 1; then Û_k, C̄_k, N_k, Z_k and d_k are defined as follows:

  C_k = C̄_k N_k, where   (36)
  N_k = diag(‖c̄_1‖_2^{-1}, ‖c̄_2‖_2^{-1}, ..., ‖c̄_k‖_2^{-1}),   (37)
  A Û_k = C̄_k Z_k,   (38)

where Z_k is assumed to be upper triangular. Finally, d_k is defined by the relation

  r_k = b - C̄_k d_k.   (39)

From this, the approximate solution x_k corresponding to r_k is implicitly represented as

  x_k = Û_k Z_k^{-1} d_k.   (40)

Using this relation, x_k can be computed at the end of the complete iteration or before truncation (see the end of this section). The implicit representation of U_k saves all the intermediate updates of previous u_i to a new u_{k+1}, which saves about 50% of the computational costs in the outer GCR iteration.

Initialization and restart. If the method is started with an initial guess x_0, we can compute r_0 = b - A x_0 and then continue analogously as before, changing the updates to r_k = r_0 - C̄_k d_k and x_k = x_0 + Û_k Z_k^{-1} d_k. Also after a restart (see Convergence check at the end at the end of this section) we compute a new result by adding a correction to the previously computed x_k.

GMRES as inner iteration. Assume we have made k outer iterations, so that Û_k, C̄_k and r_k are given. Then, in the inner GMRES iteration, the orthogonal matrix V_{m+1} is constructed such that C̄_k^T V_{m+1} = O and

  A V_m = C̄_k B_m + V_{m+1} H̄_m,   (41)
  B_m = N_k^2 C̄_k^T A V_m.   (42)

This algorithm is equivalent to the usual GMRES algorithm, except that the vectors A v_i are first orthogonalized against C̄_k. From (41) and (42) it is obvious that A V_m - C̄_k B_m = A_{C_k} V_m = V_{m+1} H̄_m, cf. theorem 1. Next we compute y according to (23) and we set (cf. (33), without normalization)

  c̄_{k+1} = V_{m+1} H̄_m y,   (43)
  û_{k+1} = V_m y,   (44)

and this leads to A û_{k+1} = A V_m y = C̄_k B_m y + V_{m+1} H̄_m y = C̄_k B_m y + c̄_{k+1}, so that if we set z_{k+1} = B_m y (with Z_{k+1,k+1} = 1) the relation A Û_{k+1} = C̄_{k+1} Z_{k+1} is again satisfied. It follows from theorem 1 that the new residual of the outer iteration equals the final residual of the inner iteration, r_{k+1} = r_m^inner, and is given by

  r_{k+1} = r_k - c̄_{k+1},   (45)

so that d_{k+1} = 1. Obviously the residual norm only needs to be computed once. From (45), if we replace the new residual of the outer iteration r_{k+1} by the residual of the inner iteration, we see an important relation that holds more generally:

  c̄_{k+1} = r_k - r_m^inner.   (46)

This relation is important, since in general (when other Krylov methods are used for the inner iteration) c_{k+1} or c̄_{k+1} cannot be computed from u_{k+1}, because u_{k+1} is not computed explicitly, nor does a relation like (43) exist. However, the (inner) residual is always known, because it is needed for the inner iteration itself; see also the part on BICGSTAB. Therefore, in our current implementation (45) is replaced by

  r_{k+1} = r_m^inner = r_k - V_{m+1} H̄_m y   (47)

and (43) by (46).
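Relation (41) is easy to check numerically for the simplified case N_k = I (so that B_m = C̄_k^T A V_m): each Arnoldi vector is first orthogonalized against the columns of C̄_k and then against the previous v_i. The script below uses random stand-ins for A, C̄_k and r_k; these are our own choices, for illustration only.

    import numpy as np

    rng = np.random.default_rng(0)
    n, k, m = 30, 4, 6
    A = rng.standard_normal((n, n)) + 5 * np.eye(n)
    C, _ = np.linalg.qr(rng.standard_normal((n, k)))    # stand-in for C_k (orthonormal)
    r = rng.standard_normal(n)
    r = r - C @ (C.T @ r)                               # r_k orthogonal to range(C_k)

    V = np.zeros((n, m + 1))
    H = np.zeros((m + 1, m))
    B = np.zeros((k, m))
    V[:, 0] = r / np.linalg.norm(r)
    for j in range(m):
        w = A @ V[:, j]
        B[:, j] = C.T @ w                               # coefficients, cf. (42)
        w = w - C @ B[:, j]                             # orthogonalize against C_k
        for i in range(j + 1):                          # usual Arnoldi step
            H[i, j] = V[:, i] @ w
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        V[:, j + 1] = w / H[j + 1, j]

    print(np.allclose(A @ V[:, :m], C @ B + V @ H))     # relation (41) holds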

GMRES with A_{C_k} (m steps), after k outer iterations:
  1. Select tol; v_1 = r_k / ‖r_k‖_2
  2. for j = 1, ..., m do
       v_{j+1} = A v_j
       for i = 1, ..., k do
          b_{i,j} = c_i^T v_{j+1};  v_{j+1} = v_{j+1} - b_{i,j} c_i
       for i = 1, ..., j do
          h_{i,j} = v_i^T v_{j+1};  v_{j+1} = v_{j+1} - h_{i,j} v_i
       h_{j+1,j} = ‖v_{j+1}‖_2;  v_{j+1} = h_{j+1,j}^{-1} v_{j+1}
     Solve min_{y_m} ‖ ‖r_k‖_2 e_1 - H̄_m y_m ‖_2
     û_{k+1} = V_m y_m
     r^inner = r_k - V_{m+1} H̄_m y_m
     Z_{1:k,k+1} = B_m y_m

Figure 3: the inner GMRES algorithm with A_{C_k}.

GCRO, generic version:
  1. Select x_0, tol; r_0 = b - A x_0, k = 0
  2. while ‖r_k‖_2 > tol do
       call inner(→ û_{k+1}, r^inner, Z_{1:k,k+1})
       c̄_{k+1} = r_k - r^inner
       (N_{k+1})_{k+1,k+1} = ‖c̄_{k+1}‖_2^{-1}
       Z_{k+1,k+1} = 1
       d_{k+1} = (N_{k+1})_{k+1,k+1}^2 (c̄_{k+1}^T r_k)
       r_{k+1} = r_k - d_{k+1} c̄_{k+1}
       k = k + 1
     x_k = x_0 + Û_k Z_k^{-1} d_k

GCRO, with inner GMRES:
  1. Select x_0, tol; r_0 = b - A x_0, k = 0
  2. while ‖r_k‖_2 > tol do
       call gmres(→ û_{k+1}, r^inner, ‖r^inner‖_2, Z_{1:k,k+1})
       c̄_{k+1} = r_k - r^inner
       (N_{k+1})_{k+1,k+1} = (‖r_k‖_2^2 - ‖r^inner‖_2^2)^{-1/2}
       Z_{k+1,k+1} = 1
       d_{k+1} = 1
       r_{k+1} = r^inner
       k = k + 1
     x_k = x_0 + Û_k Z_k^{-1} d_k

Figure 4: GCRO, a generic version and a special version for inner GMRES with A_{C_k}.

Finally, we need to compute the new coefficient of N_{k+1}, ‖c̄_{k+1}‖_2^{-1}, in order to satisfy the relations in definition 3. The outer iteration then consists only of (45) or (47), d_{k+1} = 1, the computation of ‖c̄_{k+1}‖_2^{-1} and (40) at the end.

Algorithms for nested GMRES with inner orthogonalization. In figures 3 and 4 we give an outline of the inner GMRES algorithm for A_{C_k} and two versions of the outer GCR algorithm, which we call GCRO (because of the implicit inner orthogonalization). One is a generic version with an arbitrary inner method, with the only requirement that on output the relation A û_{k+1} = r_k - r^inner + C̄_k Z_{1:k,k+1} holds. The other version explicitly uses the relations which hold after an inner iteration of (m steps of) GMRES to make a few more optimizations. However, for some problems (with GMRES for A_{C_k} as the inner method) the generic version turns out to be slightly more stable and is then to be preferred. For the experiments discussed later we used the generic version, even with an inner GMRES for A_{C_k}.
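The following sketch puts the pieces together in a deliberately naive way: an outer GCR loop that stores u_i and c_i explicitly, with m inner GMRES steps on the projected operator A_{C_k} and the solution correction recovered through (13)/(34). It omits the Û_k, Z_k, N_k bookkeeping of definition 3, truncation, restarts and breakdown handling, so it is only meant to show the structure of GCRO(m); all names are ours.

    import numpy as np

    def gmres_minres_coeffs(apply_op, r, m):
        # m Arnoldi steps with apply_op, started from r (no breakdown handling);
        # returns V_{m+1}, Hbar_m and y minimizing ||r - V_{m+1} Hbar_m y||_2.
        n = r.shape[0]
        V = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        beta = np.linalg.norm(r)
        V[:, 0] = r / beta
        for j in range(m):
            w = apply_op(V[:, j])
            for i in range(j + 1):
                H[i, j] = V[:, i] @ w
                w = w - H[i, j] * V[:, i]
            H[j + 1, j] = np.linalg.norm(w)
            V[:, j + 1] = w / H[j + 1, j]
        e1 = np.zeros(m + 1)
        e1[0] = beta
        y, *_ = np.linalg.lstsq(H, e1, rcond=None)
        return V, H, y

    def gcro(A, b, x0, m=5, tol=1e-8, max_outer=50):
        x = x0.copy()
        r = b - A @ x
        n = b.size
        C = np.zeros((n, 0))                       # orthonormal c_i, with C = A U
        U = np.zeros((n, 0))
        for _ in range(max_outer):
            if np.linalg.norm(r) <= tol:
                break
            def A_Ck(v, C=C):                      # A_{C_k} = (I - C C^T) A
                w = A @ v
                return w - C @ (C.T @ w)
            V, H, y = gmres_minres_coeffs(A_Ck, r, m)
            c = V @ (H @ y)                        # c_{k+1} before normalization, cf. (33)
            Vy = V[:, :m] @ y
            u = Vy - U @ (C.T @ (A @ Vy))          # solution correction via (13)/(34)
            nc = np.linalg.norm(c)
            c = c / nc
            u = u / nc
            beta = c @ r
            x = x + beta * u                       # outer GCR update
            r = r - beta * c
            C = np.column_stack([C, c])
            U = np.column_stack([U, u])
        return x, r

In exact arithmetic the residual produced by the outer update equals the final residual of the inner iteration, as stated in theorem 1 and (32).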

BICGSTAB as inner iteration. For the sake of simplicity we will only consider non-preconditioned BICGSTAB here (or explicitly preconditioned). The algorithm as given in [17] is listed in figure 5. The algorithm contains two matrix-vector multiplications with A, which must be replaced by A_{C_k}.

BICGSTAB:
  1. Select x_0, tol; r_0 = b - A x_0, r̂_0 = r_0, ρ_0 = α = ω_0 = 1, v_0 = p_0 = 0, i = 0
  2. while ‖r_i‖_2 > tol do
       i = i + 1
       ρ_i = (r̂_0, r_{i-1});  β = (ρ_i / ρ_{i-1})(α / ω_{i-1})
       p_i = r_{i-1} + β (p_{i-1} - ω_{i-1} v_{i-1})
       v_i = A p_i
       α = ρ_i / (r̂_0, v_i)
       s = r_{i-1} - α v_i
       t = A s
       ω_i = (t, s)/(t, t)
       x_i = x_{i-1} + α p_i + ω_i s
       r_i = s - ω_i t

Figure 5: the BICGSTAB algorithm.

BICGSTAB in GCR with orthogonalization (after k outer iterations):
  1. Select x_0, maxit, tol, rtol; r_0 = b - A x_0, r̂_0 = r_0, ρ_0 = α = ω_0 = 1, v_0 = p_0 = 0, i = 0
  2. while ‖r_i‖_2 > max(rtol ‖r_0‖_2, tol) and i < maxit do
       i = i + 1
       ρ_i = (r̂_0, r_{i-1});  β = (ρ_i / ρ_{i-1})(α / ω_{i-1})
       p_i = r_{i-1} + β (p_{i-1} - ω_{i-1} v_{i-1})
       v_i = A p_i
       y_1 = N_k^2 C̄_k^T v_i;  v_i = v_i - C̄_k y_1
       α = ρ_i / (r̂_0, v_i)
       s = r_{i-1} - α v_i
       t = A s
       y_2 = N_k^2 C̄_k^T t;  t = t - C̄_k y_2
       ω_i = (t, s)/(t, t)
       x_i = x_{i-1} + α p_i + ω_i s
       r_i = s - ω_i t
       z_{k+1} = z_{k+1} + α y_1 + ω_i y_2

Figure 6: the BICGSTAB algorithm in GCR with orthogonalization.

The coefficients of the orthogonalizations must be saved to satisfy relation (38) at the end of the iteration. This avoids the explicit correction of x_i, which will become û_{k+1} in the outer iteration. The implementation is as follows. On initialization we set z_{k+1} = 0, x_0 = 0 and hence r_0 = r_k^outer. After the statement v_i = A p_i we add

  y_1 = N_k^2 C̄_k^T v_i;  v_i = v_i - C̄_k y_1,

and after t = A s:

  y_2 = N_k^2 C̄_k^T t;  t = t - C̄_k y_2.

Further, at the end of each iteration, we set z_{k+1} = z_{k+1} + α y_1 + ω_i y_2. From this it is easily verified that the relation A x_i = r_0 - r_i + C̄_k z_{k+1} is satisfied at the end of each iteration. If some stopping criterion is satisfied after step i we set

  û_{k+1} = x_i,   (48)
  c̄_{k+1} = r_0 - r_i = r_k^outer - r_i^inner, cf. (46),   (49)
  Z_{k+1,k+1} = 1,   (50)

and the (outer iteration) relation (38) is again satisfied. In this case we have to compute the residual update explicitly.
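The only change to the inner BICGSTAB loop is thus the projection after each multiplication by A, together with the bookkeeping of its coefficients. A small helper in the spirit of figure 6 could look as follows; this is our own sketch, where Cbar holds the unnormalized c̄_i and nsq the diagonal of N_k^2, i.e. the values ‖c̄_i‖_2^{-2}.

    import numpy as np

    def project_out(Cbar, nsq, w):
        # Orthogonalize w against range(C_k) = range(Cbar N_k), cf. figure 6.
        # The coefficient vector y must be accumulated into z_{k+1} so that
        # A x_i = r_0 - r_i + Cbar z_{k+1} keeps holding (relation (38)).
        y = nsq * (Cbar.T @ w)
        return w - Cbar @ y, y

    # Inside the BICGSTAB loop of figure 6 (alpha and omega as in figure 5):
    #     v = A @ p
    #     v, y1 = project_out(Cbar, nsq, v)
    #     ...
    #     t = A @ s
    #     t, y2 = project_out(Cbar, nsq, t)
    #     ...
    #     z = z + alpha * y1 + omega * y2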

By construction the columns of C̄_{k+1} are orthogonal, so we only have to compute ‖c̄_{k+1}‖_2^{-1} and set

  N_{k+1} = diag(‖c̄_1‖_2^{-1}, ..., ‖c̄_{k+1}‖_2^{-1}),   (51)
  d_{k+1} = ‖c̄_{k+1}‖_2^{-2} c̄_{k+1}^T r_k,   (52)
  r_{k+1} = r_k - d_{k+1} c̄_{k+1}.   (53)

After this the relations of definition 3 are again satisfied. Experience with BICGSTAB as inner iteration indicates that limiting the number of inner iterations and using a large relative tolerance (with respect to the outer residual norm) gives the best interplay between the outer and inner iteration. An often employed strategy is to allow a maximum of 25 inner iterations and a relative tolerance of 10^{-2}. The inner algorithm is given in figure 6.

Truncation in general. When the number of outer iterations becomes large, the multiplication by A_{C_k} will be expensive and we may run out of memory. Therefore we consider the truncation of the outer iteration. We will give only a general description here; for a more detailed discussion see [4] or [5] and [10]. We choose a matrix W_l ∈ R^{k×l}, where l < k, such that

  W_l^T W_l = I_l.   (54)

There are now two ways to implement the truncation:

1. First we compute the current approximation according to (40): x_k = Û_k Z_k^{-1} d_k. Then we compute the new matrices U_l and C_l as follows:

  C_l = C_k W_l = C̄_k N_k W_l = C̄_k W̄_l, where W̄_l = N_k W_l,   (55)
  U_l = U_k W_l = Û_k Z_k^{-1} N_k W_l = Û_k Z_k^{-1} W̄_l.   (56)

In order to satisfy the relations of definition 3 we set

  x_0 = x_k, r_0 = r_k, C̄_l = C_l, Û_l = U_l, Z_l = I_l, d_l = 0, N_l = diag(1, ..., 1),   (57)

and we replace relation (39) by r_l = r_0 - C̄_l d_l; after this we have to add corrections to x_0, so that (40) must be replaced by x_l = x_0 + Û_l Z_l^{-1} d_l.

2. The second method does not require the computation of x_k and is therefore more in line with the overall algorithm. Instead, we use w_1 to compute an implicit representation of x_k:

  w_1 = η^{-1} N_k^{-1} d_k, where η = ‖N_k^{-1} d_k‖_2.   (58)

The other vectors of W_l can then be chosen freely, as long as they satisfy (54). Furthermore, we compute C̄_l, Û_l, N_l, Z_l and r_l according to (55)-(57). Finally, we set d_l = η e_1, so that the relation r_l = b - C̄_l d_l is satisfied:

  r_l = b - C̄_l d_l = b - η C̄_k N_k w_1 = b - C̄_k N_k N_k^{-1} d_k = b - C̄_k d_k = r_k.

Then x_l is implicitly represented by x_l = Û_l Z_l^{-1} d_l (= U_l d_l).

Parallel implementation. Both the GMRESR method and the method just described (GCRO) will run efficiently on (large) distributed memory computers if they use a variant of GMRES that reduces the cost of global communication in the inner products, as described in [3] and [2]. Furthermore, GCRO with inner GMRES has the advantage that it converges in almost the same number of iterations as (full) GMRES (see Numerical Experiments), while it requires far fewer inner products, which involve expensive global communication. GCRO with inner GMRES also uses much less memory than GMRES for an equal number of iterations (matrix-vector products). Because the cost of global communication is often the bottleneck for fast parallel implementations of GMRES on large distributed memory computers, and because on these computers memory restrictions can be a severe constraint, GCRO with inner GMRES will perform much better than GMRES for many problems on large distributed memory computers.

Convergence check at the end. Since the residual is never explicitly computed from the approximate solution, it is advisable to check the true residual at the end, when the solution is finally computed. If the norm of the true residual turns out to be larger than the prescribed tolerance, we first reorthogonalize the true residual on C_k and compute the corresponding approximate solution. Generally this is sufficient to get the norm of the true residual below the prescribed tolerance. If it is not, then the process can simply be restarted, while keeping the old C_k and Û_k in order to preserve optimality in these directions. It turns out that (in our experience) only a few inner iterations then suffice.

4 Numerical Experiments

We will discuss the results of three numerical experiments, which concern the solution of two-dimensional convection-diffusion type problems on regular grids, discretized by a finite volume technique, resulting in a pentadiagonal matrix. The system is preconditioned with ILU applied to the scaled system, see [6], [12]. These problems are solved by the following iterative methods:

- (full) GMRES
- BICGSTAB
- GMRESR(m): m indicates the number of inner GMRES iterations for each outer iteration.
- GCRO(m), which is GCR with m GMRES iterations for A_{C_k} as inner method.
- GMRESRSTAB: GMRESR with BICGSTAB as inner method.
- GCROSTAB: GCR with BICGSTAB for A_{C_k} as inner method.

We will compare the convergence of these methods both with respect to time (on one processor of the Convex C3840) and with respect to the number of matrix-vector multiplications. This makes sense since the main trade-off between (full) GMRES, the GCRO variants and the GMRESR variants is fewer iterations against less work per iteration. Which one converges faster in time then depends highly on the relative cost of the matrix-vector multiplication and preconditioning.

Problem 1. The first problem solves the equation

  -(u_xx + u_yy) + b (u_x + u_y) = f

on the unit square, where b is given by

  b(x, y) = 1 for (x, y) ∈ [0.5, 0.6] × [0.5, 0.6], and 50 elsewhere,

and homogeneous Dirichlet boundary conditions. The right-hand side is chosen such that the solution is given by u(x, y) = sin(πx) sin(πy). The step size in both the x- and y-direction is 1/...

Figure 7: convergence vs. number of iterations (#matvec) for problem 1; log(‖r‖) for (full) GMRES, GCRO(m) and GMRESR(m).

The convergence history for problem 1 is given in figures 7 and 8 for GMRES, GCRO(m) and GMRESR(m), for m = 5 and m = 10. Figure 7 shows that GMRES converges fastest (in iterations), which is of course to be expected, followed by GCRO(5), GMRESR(5), GCRO(10) and GMRESR(10). From figure 7 we can see that GCRO(m) converges much more smoothly and faster than GMRESR(m) and follows the convergence of GMRES relatively well. The vertical 'steps' of GMRESR(m) are caused by the optimization in the outer iteration, which does not involve a matrix-vector multiplication. This figure also shows another important difference: the GMRESR(m) variants tend to lose their superlinear convergence behaviour, at least during certain stages of the convergence history. This seems to be caused by stagnation or slow convergence in the inner iteration, which (of course) essentially behaves like a restarted GMRES. Figure 8 gives the convergence with respect to time. GCRO(5) is the fastest, which is not surprising in view of the fact that it converges almost as fast as GMRES, but at a much lower cost per iteration. Also, we see that GCRO(10), while slower than GMRESR(5), is still faster than GMRESR(10). This shows that for this example the improvement in convergence outweighs the extra work in orthogonalizations. Another feature which can be seen from this last figure is that (at least in time) relatively short inner iterations are best for GCRO(m) and GMRESR(m). We will also see this in the other examples. Apparently the interplay between outer and inner method and a sufficiently large number of outer iterations is important.

Problem 2. The second problem is defined by the discretization of

  -(u_xx + u_yy) + b u_x + c u_y = 0

on [0, 1] × [0, 4], where

  b(x, y) = 100 for 0 ≤ y ≤ 1,  -100 for 1 < y ≤ 2,  100 for 2 < y ≤ 3,  -100 for 3 < y ≤ 4,

and c = 100. The boundary conditions are u = 1 on y = 0, u = 0 on y = 4, u' = 0 on x = 0 and u' = 0 on x = 1, where u' denotes the (outward) normal derivative, see figure 9. The step size in the x-direction is 1/100 and in the y-direction 1/50.

Figure 8: convergence vs. time (s) for problem 1; log(‖r‖) for (full) GMRES, GCRO(m), GMRESR(m) and BICGSTAB.

Figure 9: Problem 2 (domain, convection coefficients and boundary conditions). Figure 10: Problem 3 (domain, coefficient a, right-hand side f and boundary conditions).

The results of problem 2 are given in figures 11 and 12. In this example, the difference between (full) GMRES and GCRO(5) in convergence against the number of matrix-vector multiplications is even smaller than in the previous one, and GCRO(5) converges (almost) exactly like GMRES. Figure 11 shows very well the difference in convergence behaviour between GMRESR(m) and GCRO(m). GMRESR(m) shows a tendency to stagnate in the inner iteration (especially for m = 10) and therefore converges much more slowly (loss of superlinear convergence) than GCRO(m). GCRO(m) preserves the superlinear convergence of GMRES quite well. This is explained by the 'global' optimization in GCRO(m) over both the inner and the outer search vectors (the latter form a 'best' sample of the entire, previously searched Krylov subspace). So we may view this as a semi-full GMRES. Figure 12 shows that, again, GCRO(5) is the fastest in CPU-time, but that GMRESR(5) is faster than GCRO(10), although they take about the same number of iterations. Also observe that GMRES is the slowest method in CPU-time, due to the large number of orthogonalizations. This illustrates the trade-offs between extra work (orthogonalizations) and faster convergence.

Problem 3. The third problem is taken from [17]. It solves the equation

  -(a u_x)_x - (a u_y)_y + b u_x = f

on the unit square, where b is given by b = 2 e^{2(x^2 + y^2)}. Along the boundaries we have Dirichlet boundary conditions, see figure 10. The functions a and f are also given in figure 10; f = 0 everywhere, except for the small square in the centre, where f = 100. The step size in both the x- and y-direction is 1/128.

Figure 11: convergence vs. number of iterations (#matvec) for problem 2; log(‖r‖) for (full) GMRES, GCRO(m) and GMRESR(m).

The results of problem 3 are given in figures 13, 14 and 15. Figure 13 gives the convergence plot for GMRES, GCRO(m) and GMRESR(m). Here we used m = 10 (optimal) and m = 50 to highlight the difference in convergence behaviour in the inner (GMRES) iteration of GMRESR(m) and GCRO(m). GMRESR(50) stagnates in the inner GMRES iteration, whereas GCRO(50) displays almost the same convergence behaviour as GCRO(10) and full GMRES. In figure 14 the convergence plot is given for GMRES, BICGSTAB, and BICGSTAB as the inner method in GMRESR (GMRESRSTAB) and GCRO (GCROSTAB). The following strategy gave the best results for the BICGSTAB variants: for GMRESRSTAB the inner iteration was ended after either 20 steps or a relative reduction of the residual norm by a factor 0.01; for GCROSTAB the inner iteration was ended after 25 steps or a relative reduction of the residual norm by a factor 0.01. The convergence behaviour of GMRESRSTAB for this example is somewhat typical for GMRESRSTAB in general (albeit very bad in this particular case). A reason for this erratic behaviour may be that the convergence of BICGSTAB depends on the (implicit) (bi-)orthogonality to some basis. This relation is destroyed after each outer iteration. Now, if the inner iteration does not yield a good approximation, the outer iteration will not give much improvement either, and the method becomes 'trapped'. At the start the same seems to hold for GCROSTAB; however, after a few outer GCR iterations the 'improved' operator A_{C_k} somehow yields a better convergence than BICGSTAB by itself. We have also observed this in other tests, although it may also happen that GCROSTAB converges worse than BICGSTAB. In figure 15 the convergence versus CPU-time is given for GCROSTAB, BICGSTAB, GCRO(10) and GMRESR(10). GCROSTAB gives the best convergence in time; it is approximately 20% faster than BICGSTAB, notwithstanding the extra work in orthogonalizations. Although GCRO(10) converges in fewer iterations than GMRESR(10), in time GMRESR(10) is faster.

So in this case the decrease in iterations does not outweigh the extra work in orthogonalizations. For completeness we mention that GMRESRSTAB took almost 15 seconds to converge, whereas GMRES took about 20 seconds.

Figure 12: convergence vs. time (s) for problem 2; log(‖r‖) for (full) GMRES, GCRO(m) and GMRESR(m).

5 Conclusions

From the GMRESR methods we have derived a modified set of methods, which preserve the optimality of the outer method in the inner iteration. This optimality is lost in the inner iteration of GMRESR, since it essentially uses 'restarted' (inner) iterations, which cannot take advantage of the 'convergence history'. Therefore GMRESR(m) may lose the superlinear convergence behaviour of (full) GMRES, due to the stagnation or slow convergence of the inner, restarted GMRES(m), which may occur at each inner iteration. In contrast, the GCRO variants exploit the 'convergence history' to generate a search space in the inner iteration that does not interfere with the previous minimization of the error over the space of outer search vectors. If GMRES(m) is then used as the inner method (giving GCRO(m)), we do a global minimization of the error over both the inner search space and the outer search space, where the set of outer search vectors is a sample of the entire, previously searched Krylov subspace. From this point of view we may say that GCRO(m) is a semi-full GMRES. This probably accounts for the smooth convergence, the preservation of superlinear convergence (if GMRES displays it) and the absence of the stagnation which may occur in the inner method of GMRESR. In many practical problems the convergence behaviour in iterations of GCRO(m) is almost the same as that of GMRES. Apparently the subset of Krylov subspace vectors that is maintained approximates the entire Krylov subspace that has been generated sufficiently well. Because GCRO(m) requires far fewer inner products and less storage for the same number of iterations than GMRES, it is better suited for implementation on large, distributed memory, parallel computers. Although there is the possibility of breakdown in the inner method for GCRO, this seems to occur rarely, as is indicated by theorem 5; indeed, it has never happened in any of our experiments. Moreover, in case the generation of the Krylov subspace does break down, we have suggested two ways to continue.


More information

Iterative Methods and Multigrid

Iterative Methods and Multigrid Iterative Methods and Multigrid Part 3: Preconditioning 2 Eric de Sturler Preconditioning The general idea behind preconditioning is that convergence of some method for the linear system Ax = b can be

More information

-.- Bi-CG... GMRES(25) --- Bi-CGSTAB BiCGstab(2)

-.- Bi-CG... GMRES(25) --- Bi-CGSTAB BiCGstab(2) .... Advection dominated problem -.- Bi-CG... GMRES(25) --- Bi-CGSTAB BiCGstab(2) * Universiteit Utrecht -2 log1 of residual norm -4-6 -8 Department of Mathematics - GMRES(25) 2 4 6 8 1 Hybrid Bi-Conjugate

More information

M.A. Botchev. September 5, 2014

M.A. Botchev. September 5, 2014 Rome-Moscow school of Matrix Methods and Applied Linear Algebra 2014 A short introduction to Krylov subspaces for linear systems, matrix functions and inexact Newton methods. Plan and exercises. M.A. Botchev

More information

SOR as a Preconditioner. A Dissertation. Presented to. University of Virginia. In Partial Fulllment. of the Requirements for the Degree

SOR as a Preconditioner. A Dissertation. Presented to. University of Virginia. In Partial Fulllment. of the Requirements for the Degree SOR as a Preconditioner A Dissertation Presented to The Faculty of the School of Engineering and Applied Science University of Virginia In Partial Fulllment of the Reuirements for the Degree Doctor of

More information

IDR(s) A family of simple and fast algorithms for solving large nonsymmetric systems of linear equations

IDR(s) A family of simple and fast algorithms for solving large nonsymmetric systems of linear equations IDR(s) A family of simple and fast algorithms for solving large nonsymmetric systems of linear equations Harrachov 2007 Peter Sonneveld en Martin van Gijzen August 24, 2007 1 Delft University of Technology

More information

Linear Solvers. Andrew Hazel

Linear Solvers. Andrew Hazel Linear Solvers Andrew Hazel Introduction Thus far we have talked about the formulation and discretisation of physical problems...... and stopped when we got to a discrete linear system of equations. Introduction

More information

Algorithms that use the Arnoldi Basis

Algorithms that use the Arnoldi Basis AMSC 600 /CMSC 760 Advanced Linear Numerical Analysis Fall 2007 Arnoldi Methods Dianne P. O Leary c 2006, 2007 Algorithms that use the Arnoldi Basis Reference: Chapter 6 of Saad The Arnoldi Basis How to

More information

On the influence of eigenvalues on Bi-CG residual norms

On the influence of eigenvalues on Bi-CG residual norms On the influence of eigenvalues on Bi-CG residual norms Jurjen Duintjer Tebbens Institute of Computer Science Academy of Sciences of the Czech Republic duintjertebbens@cs.cas.cz Gérard Meurant 30, rue

More information

OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative methods ffl Krylov subspace methods ffl Preconditioning techniques: Iterative methods ILU

OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative methods ffl Krylov subspace methods ffl Preconditioning techniques: Iterative methods ILU Preconditioning Techniques for Solving Large Sparse Linear Systems Arnold Reusken Institut für Geometrie und Praktische Mathematik RWTH-Aachen OUTLINE ffl CFD: elliptic pde's! Ax = b ffl Basic iterative

More information

ITERATIVE METHODS BASED ON KRYLOV SUBSPACES

ITERATIVE METHODS BASED ON KRYLOV SUBSPACES ITERATIVE METHODS BASED ON KRYLOV SUBSPACES LONG CHEN We shall present iterative methods for solving linear algebraic equation Au = b based on Krylov subspaces We derive conjugate gradient (CG) method

More information

6.4 Krylov Subspaces and Conjugate Gradients

6.4 Krylov Subspaces and Conjugate Gradients 6.4 Krylov Subspaces and Conjugate Gradients Our original equation is Ax = b. The preconditioned equation is P Ax = P b. When we write P, we never intend that an inverse will be explicitly computed. P

More information

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) Contents 1 Vector Spaces 1 1.1 The Formal Denition of a Vector Space.................................. 1 1.2 Subspaces...................................................

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 18 Outline

More information

DELFT UNIVERSITY OF TECHNOLOGY

DELFT UNIVERSITY OF TECHNOLOGY DELFT UNIVERSITY OF TECHNOLOGY REPORT 06-05 Solution of the incompressible Navier Stokes equations with preconditioned Krylov subspace methods M. ur Rehman, C. Vuik G. Segal ISSN 1389-6520 Reports of the

More information

Relaxation strategies for nested Krylov methods

Relaxation strategies for nested Krylov methods Journal of Computational and Applied Mathematics 177 (2005) 347 365 www.elsevier.com/locate/cam Relaxation strategies for nested Krylov methods Jasper van den Eshof a, Gerard L.G. Sleijpen b,, Martin B.

More information

Domain decomposition on different levels of the Jacobi-Davidson method

Domain decomposition on different levels of the Jacobi-Davidson method hapter 5 Domain decomposition on different levels of the Jacobi-Davidson method Abstract Most computational work of Jacobi-Davidson [46], an iterative method suitable for computing solutions of large dimensional

More information

Recycling Bi-Lanczos Algorithms: BiCG, CGS, and BiCGSTAB

Recycling Bi-Lanczos Algorithms: BiCG, CGS, and BiCGSTAB Recycling Bi-Lanczos Algorithms: BiCG, CGS, and BiCGSTAB Kapil Ahuja Thesis submitted to the Faculty of the Virginia Polytechnic Institute and State University in partial fulfillment of the requirements

More information

Computational Linear Algebra

Computational Linear Algebra Computational Linear Algebra PD Dr. rer. nat. habil. Ralf-Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2018/19 Part 4: Iterative Methods PD

More information

Finite-choice algorithm optimization in Conjugate Gradients

Finite-choice algorithm optimization in Conjugate Gradients Finite-choice algorithm optimization in Conjugate Gradients Jack Dongarra and Victor Eijkhout January 2003 Abstract We present computational aspects of mathematically equivalent implementations of the

More information

Fast iterative solvers

Fast iterative solvers Utrecht, 15 november 2017 Fast iterative solvers Gerard Sleijpen Department of Mathematics http://www.staff.science.uu.nl/ sleij101/ Review. To simplify, take x 0 = 0, assume b 2 = 1. Solving Ax = b for

More information

Solving Large Nonlinear Sparse Systems

Solving Large Nonlinear Sparse Systems Solving Large Nonlinear Sparse Systems Fred W. Wubs and Jonas Thies Computational Mechanics & Numerical Mathematics University of Groningen, the Netherlands f.w.wubs@rug.nl Centre for Interdisciplinary

More information

A MULTIGRID ALGORITHM FOR. Richard E. Ewing and Jian Shen. Institute for Scientic Computation. Texas A&M University. College Station, Texas SUMMARY

A MULTIGRID ALGORITHM FOR. Richard E. Ewing and Jian Shen. Institute for Scientic Computation. Texas A&M University. College Station, Texas SUMMARY A MULTIGRID ALGORITHM FOR THE CELL-CENTERED FINITE DIFFERENCE SCHEME Richard E. Ewing and Jian Shen Institute for Scientic Computation Texas A&M University College Station, Texas SUMMARY In this article,

More information

Stabilization and Acceleration of Algebraic Multigrid Method

Stabilization and Acceleration of Algebraic Multigrid Method Stabilization and Acceleration of Algebraic Multigrid Method Recursive Projection Algorithm A. Jemcov J.P. Maruszewski Fluent Inc. October 24, 2006 Outline 1 Need for Algorithm Stabilization and Acceleration

More information

Chapter 7. Iterative methods for large sparse linear systems. 7.1 Sparse matrix algebra. Large sparse matrices

Chapter 7. Iterative methods for large sparse linear systems. 7.1 Sparse matrix algebra. Large sparse matrices Chapter 7 Iterative methods for large sparse linear systems In this chapter we revisit the problem of solving linear systems of equations, but now in the context of large sparse systems. The price to pay

More information

Computational Linear Algebra

Computational Linear Algebra Computational Linear Algebra PD Dr. rer. nat. habil. Ralf Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2017/18 Part 3: Iterative Methods PD

More information

Vector Space Basics. 1 Abstract Vector Spaces. 1. (commutativity of vector addition) u + v = v + u. 2. (associativity of vector addition)

Vector Space Basics. 1 Abstract Vector Spaces. 1. (commutativity of vector addition) u + v = v + u. 2. (associativity of vector addition) Vector Space Basics (Remark: these notes are highly formal and may be a useful reference to some students however I am also posting Ray Heitmann's notes to Canvas for students interested in a direct computational

More information

A DISSERTATION. Extensions of the Conjugate Residual Method. by Tomohiro Sogabe. Presented to

A DISSERTATION. Extensions of the Conjugate Residual Method. by Tomohiro Sogabe. Presented to A DISSERTATION Extensions of the Conjugate Residual Method ( ) by Tomohiro Sogabe Presented to Department of Applied Physics, The University of Tokyo Contents 1 Introduction 1 2 Krylov subspace methods

More information

IDR(s) as a projection method

IDR(s) as a projection method Delft University of Technology Faculty of Electrical Engineering, Mathematics and Computer Science Delft Institute of Applied Mathematics IDR(s) as a projection method A thesis submitted to the Delft Institute

More information

An advanced ILU preconditioner for the incompressible Navier-Stokes equations

An advanced ILU preconditioner for the incompressible Navier-Stokes equations An advanced ILU preconditioner for the incompressible Navier-Stokes equations M. ur Rehman C. Vuik A. Segal Delft Institute of Applied Mathematics, TU delft The Netherlands Computational Methods with Applications,

More information

Solving Sparse Linear Systems: Iterative methods

Solving Sparse Linear Systems: Iterative methods Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccs Lecture Notes for Unit VII Sparse Matrix Computations Part 2: Iterative Methods Dianne P. O Leary c 2008,2010

More information

Solving Sparse Linear Systems: Iterative methods

Solving Sparse Linear Systems: Iterative methods Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit VII Sparse Matrix Computations Part 2: Iterative Methods Dianne P. O Leary

More information

University of Maryland at College Park. limited amount of computer memory, thereby allowing problems with a very large number

University of Maryland at College Park. limited amount of computer memory, thereby allowing problems with a very large number Limited-Memory Matrix Methods with Applications 1 Tamara Gibson Kolda 2 Applied Mathematics Program University of Maryland at College Park Abstract. The focus of this dissertation is on matrix decompositions

More information

CG Type Algorithm for Indefinite Linear Systems 1. Conjugate Gradient Type Methods for Solving Symmetric, Indenite Linear Systems.

CG Type Algorithm for Indefinite Linear Systems 1. Conjugate Gradient Type Methods for Solving Symmetric, Indenite Linear Systems. CG Type Algorithm for Indefinite Linear Systems 1 Conjugate Gradient Type Methods for Solving Symmetric, Indenite Linear Systems Jan Modersitzki ) Institute of Applied Mathematics Bundesstrae 55 20146

More information

Preconditioned inverse iteration and shift-invert Arnoldi method

Preconditioned inverse iteration and shift-invert Arnoldi method Preconditioned inverse iteration and shift-invert Arnoldi method Melina Freitag Department of Mathematical Sciences University of Bath CSC Seminar Max-Planck-Institute for Dynamics of Complex Technical

More information

Nested Krylov methods for shifted linear systems

Nested Krylov methods for shifted linear systems Nested Krylov methods for shifted linear systems M. Baumann, and M. B. van Gizen Email: M.M.Baumann@tudelft.nl Delft Institute of Applied Mathematics Delft University of Technology Delft, The Netherlands

More information

The Conjugate Gradient Method

The Conjugate Gradient Method The Conjugate Gradient Method Classical Iterations We have a problem, We assume that the matrix comes from a discretization of a PDE. The best and most popular model problem is, The matrix will be as large

More information

Lab 1: Iterative Methods for Solving Linear Systems

Lab 1: Iterative Methods for Solving Linear Systems Lab 1: Iterative Methods for Solving Linear Systems January 22, 2017 Introduction Many real world applications require the solution to very large and sparse linear systems where direct methods such as

More information

Solving Ax = b, an overview. Program

Solving Ax = b, an overview. Program Numerical Linear Algebra Improving iterative solvers: preconditioning, deflation, numerical software and parallelisation Gerard Sleijpen and Martin van Gijzen November 29, 27 Solving Ax = b, an overview

More information

Chapter 7 Iterative Techniques in Matrix Algebra

Chapter 7 Iterative Techniques in Matrix Algebra Chapter 7 Iterative Techniques in Matrix Algebra Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 128B Numerical Analysis Vector Norms Definition

More information

1e N

1e N Spectral schemes on triangular elements by Wilhelm Heinrichs and Birgit I. Loch Abstract The Poisson problem with homogeneous Dirichlet boundary conditions is considered on a triangle. The mapping between

More information

Solving Symmetric Indefinite Systems with Symmetric Positive Definite Preconditioners

Solving Symmetric Indefinite Systems with Symmetric Positive Definite Preconditioners Solving Symmetric Indefinite Systems with Symmetric Positive Definite Preconditioners Eugene Vecharynski 1 Andrew Knyazev 2 1 Department of Computer Science and Engineering University of Minnesota 2 Department

More information

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit V: Eigenvalue Problems Lecturer: Dr. David Knezevic Unit V: Eigenvalue Problems Chapter V.4: Krylov Subspace Methods 2 / 51 Krylov Subspace Methods In this chapter we give

More information

Conjugate gradient method. Descent method. Conjugate search direction. Conjugate Gradient Algorithm (294)

Conjugate gradient method. Descent method. Conjugate search direction. Conjugate Gradient Algorithm (294) Conjugate gradient method Descent method Hestenes, Stiefel 1952 For A N N SPD In exact arithmetic, solves in N steps In real arithmetic No guaranteed stopping Often converges in many fewer than N steps

More information

Topics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems

Topics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems Topics The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems What about non-spd systems? Methods requiring small history Methods requiring large history Summary of solvers 1 / 52 Conjugate

More information

1. Introduction. Krylov subspace methods are used extensively for the iterative solution of linear systems of equations Ax = b.

1. Introduction. Krylov subspace methods are used extensively for the iterative solution of linear systems of equations Ax = b. SIAM J. SCI. COMPUT. Vol. 31, No. 2, pp. 1035 1062 c 2008 Society for Industrial and Applied Mathematics IDR(s): A FAMILY OF SIMPLE AND FAST ALGORITHMS FOR SOLVING LARGE NONSYMMETRIC SYSTEMS OF LINEAR

More information

t x 0.25

t x 0.25 Journal of ELECTRICAL ENGINEERING, VOL. 52, NO. /s, 2, 48{52 COMPARISON OF BROYDEN AND NEWTON METHODS FOR SOLVING NONLINEAR PARABOLIC EQUATIONS Ivan Cimrak If the time discretization of a nonlinear parabolic

More information

AMS Mathematics Subject Classification : 65F10,65F50. Key words and phrases: ILUS factorization, preconditioning, Schur complement, 1.

AMS Mathematics Subject Classification : 65F10,65F50. Key words and phrases: ILUS factorization, preconditioning, Schur complement, 1. J. Appl. Math. & Computing Vol. 15(2004), No. 1, pp. 299-312 BILUS: A BLOCK VERSION OF ILUS FACTORIZATION DAVOD KHOJASTEH SALKUYEH AND FAEZEH TOUTOUNIAN Abstract. ILUS factorization has many desirable

More information

problem Au = u by constructing an orthonormal basis V k = [v 1 ; : : : ; v k ], at each k th iteration step, and then nding an approximation for the e

problem Au = u by constructing an orthonormal basis V k = [v 1 ; : : : ; v k ], at each k th iteration step, and then nding an approximation for the e A Parallel Solver for Extreme Eigenpairs 1 Leonardo Borges and Suely Oliveira 2 Computer Science Department, Texas A&M University, College Station, TX 77843-3112, USA. Abstract. In this paper a parallel

More information

Peter Deuhard. for Symmetric Indenite Linear Systems

Peter Deuhard. for Symmetric Indenite Linear Systems Peter Deuhard A Study of Lanczos{Type Iterations for Symmetric Indenite Linear Systems Preprint SC 93{6 (March 993) Contents 0. Introduction. Basic Recursive Structure 2. Algorithm Design Principles 7

More information

Gradient Method Based on Roots of A

Gradient Method Based on Roots of A Journal of Scientific Computing, Vol. 15, No. 4, 2000 Solving Ax Using a Modified Conjugate Gradient Method Based on Roots of A Paul F. Fischer 1 and Sigal Gottlieb 2 Received January 23, 2001; accepted

More information

In order to solve the linear system KL M N when K is nonsymmetric, we can solve the equivalent system

In order to solve the linear system KL M N when K is nonsymmetric, we can solve the equivalent system !"#$% "&!#' (%)!#" *# %)%(! #! %)!#" +, %"!"#$ %*&%! $#&*! *# %)%! -. -/ 0 -. 12 "**3! * $!#%+,!2!#% 44" #% &#33 # 4"!#" "%! "5"#!!#6 -. - #% " 7% "3#!#3! - + 87&2! * $!#% 44" ) 3( $! # % %#!!#%+ 9332!

More information

Accelerated Solvers for CFD

Accelerated Solvers for CFD Accelerated Solvers for CFD Co-Design of Hardware/Software for Predicting MAV Aerodynamics Eric de Sturler, Virginia Tech Mathematics Email: sturler@vt.edu Web: http://www.math.vt.edu/people/sturler Co-design

More information

Parallel Numerics, WT 2016/ Iterative Methods for Sparse Linear Systems of Equations. page 1 of 1

Parallel Numerics, WT 2016/ Iterative Methods for Sparse Linear Systems of Equations. page 1 of 1 Parallel Numerics, WT 2016/2017 5 Iterative Methods for Sparse Linear Systems of Equations page 1 of 1 Contents 1 Introduction 1.1 Computer Science Aspects 1.2 Numerical Problems 1.3 Graphs 1.4 Loop Manipulations

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 9 Minimizing Residual CG

More information

Lanczos Bidiagonalization with Subspace Augmentation for Discrete Inverse Problems

Lanczos Bidiagonalization with Subspace Augmentation for Discrete Inverse Problems Lanczos Bidiagonalization with Subspace Augmentation for Discrete Inverse Problems Per Christian Hansen Technical University of Denmark Ongoing work with Kuniyoshi Abe, Gifu Dedicated to Dianne P. O Leary

More information

DELFT UNIVERSITY OF TECHNOLOGY

DELFT UNIVERSITY OF TECHNOLOGY DELFT UNIVERSITY OF TECHNOLOGY REPORT 18-05 Efficient and robust Schur complement approximations in the augmented Lagrangian preconditioner for high Reynolds number laminar flows X. He and C. Vuik ISSN

More information

Reduced Synchronization Overhead on. December 3, Abstract. The standard formulation of the conjugate gradient algorithm involves

Reduced Synchronization Overhead on. December 3, Abstract. The standard formulation of the conjugate gradient algorithm involves Lapack Working Note 56 Conjugate Gradient Algorithms with Reduced Synchronization Overhead on Distributed Memory Multiprocessors E. F. D'Azevedo y, V.L. Eijkhout z, C. H. Romine y December 3, 1999 Abstract

More information

CONVERGENCE BOUNDS FOR PRECONDITIONED GMRES USING ELEMENT-BY-ELEMENT ESTIMATES OF THE FIELD OF VALUES

CONVERGENCE BOUNDS FOR PRECONDITIONED GMRES USING ELEMENT-BY-ELEMENT ESTIMATES OF THE FIELD OF VALUES European Conference on Computational Fluid Dynamics ECCOMAS CFD 2006 P. Wesseling, E. Oñate and J. Périaux (Eds) c TU Delft, The Netherlands, 2006 CONVERGENCE BOUNDS FOR PRECONDITIONED GMRES USING ELEMENT-BY-ELEMENT

More information

A Domain Decomposition Based Jacobi-Davidson Algorithm for Quantum Dot Simulation

A Domain Decomposition Based Jacobi-Davidson Algorithm for Quantum Dot Simulation A Domain Decomposition Based Jacobi-Davidson Algorithm for Quantum Dot Simulation Tao Zhao 1, Feng-Nan Hwang 2 and Xiao-Chuan Cai 3 Abstract In this paper, we develop an overlapping domain decomposition

More information

Preface to the Second Edition. Preface to the First Edition

Preface to the Second Edition. Preface to the First Edition n page v Preface to the Second Edition Preface to the First Edition xiii xvii 1 Background in Linear Algebra 1 1.1 Matrices................................. 1 1.2 Square Matrices and Eigenvalues....................

More information

SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA

SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA 1 SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA 2 OUTLINE Sparse matrix storage format Basic factorization

More information

The parallel computation of the smallest eigenpair of an. acoustic problem with damping. Martin B. van Gijzen and Femke A. Raeven.

The parallel computation of the smallest eigenpair of an. acoustic problem with damping. Martin B. van Gijzen and Femke A. Raeven. The parallel computation of the smallest eigenpair of an acoustic problem with damping. Martin B. van Gijzen and Femke A. Raeven Abstract Acoustic problems with damping may give rise to large quadratic

More information

Krylov Space Solvers

Krylov Space Solvers Seminar for Applied Mathematics ETH Zurich International Symposium on Frontiers of Computational Science Nagoya, 12/13 Dec. 2005 Sparse Matrices Large sparse linear systems of equations or large sparse

More information

Preconditioners for the incompressible Navier Stokes equations

Preconditioners for the incompressible Navier Stokes equations Preconditioners for the incompressible Navier Stokes equations C. Vuik M. ur Rehman A. Segal Delft Institute of Applied Mathematics, TU Delft, The Netherlands SIAM Conference on Computational Science and

More information

STABILITY OF INVARIANT SUBSPACES OF COMMUTING MATRICES We obtain some further results for pairs of commuting matrices. We show that a pair of commutin

STABILITY OF INVARIANT SUBSPACES OF COMMUTING MATRICES We obtain some further results for pairs of commuting matrices. We show that a pair of commutin On the stability of invariant subspaces of commuting matrices Tomaz Kosir and Bor Plestenjak September 18, 001 Abstract We study the stability of (joint) invariant subspaces of a nite set of commuting

More information

The model reduction algorithm proposed is based on an iterative two-step LMI scheme. The convergence of the algorithm is not analyzed but examples sho

The model reduction algorithm proposed is based on an iterative two-step LMI scheme. The convergence of the algorithm is not analyzed but examples sho Model Reduction from an H 1 /LMI perspective A. Helmersson Department of Electrical Engineering Linkoping University S-581 8 Linkoping, Sweden tel: +6 1 816 fax: +6 1 86 email: andersh@isy.liu.se September

More information

DELFT UNIVERSITY OF TECHNOLOGY

DELFT UNIVERSITY OF TECHNOLOGY DELFT UNIVERSITY OF TECHNOLOGY REPORT 11-14 On the convergence of inexact Newton methods R. Idema, D.J.P. Lahaye, and C. Vuik ISSN 1389-6520 Reports of the Department of Applied Mathematical Analysis Delft

More information

The iterative convex minorant algorithm for nonparametric estimation

The iterative convex minorant algorithm for nonparametric estimation The iterative convex minorant algorithm for nonparametric estimation Report 95-05 Geurt Jongbloed Technische Universiteit Delft Delft University of Technology Faculteit der Technische Wiskunde en Informatica

More information

Residual iterative schemes for largescale linear systems

Residual iterative schemes for largescale linear systems Universidad Central de Venezuela Facultad de Ciencias Escuela de Computación Lecturas en Ciencias de la Computación ISSN 1316-6239 Residual iterative schemes for largescale linear systems William La Cruz

More information

9.1 Preconditioned Krylov Subspace Methods

9.1 Preconditioned Krylov Subspace Methods Chapter 9 PRECONDITIONING 9.1 Preconditioned Krylov Subspace Methods 9.2 Preconditioned Conjugate Gradient 9.3 Preconditioned Generalized Minimal Residual 9.4 Relaxation Method Preconditioners 9.5 Incomplete

More information

Krylov Space Methods. Nonstationary sounds good. Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17

Krylov Space Methods. Nonstationary sounds good. Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17 Krylov Space Methods Nonstationary sounds good Radu Trîmbiţaş Babeş-Bolyai University Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17 Introduction These methods are used both to solve

More information

The Conjugate Gradient Method

The Conjugate Gradient Method The Conjugate Gradient Method The minimization problem We are given a symmetric positive definite matrix R n n and a right hand side vector b R n We want to solve the linear system Find u R n such that

More information

Multigrid absolute value preconditioning

Multigrid absolute value preconditioning Multigrid absolute value preconditioning Eugene Vecharynski 1 Andrew Knyazev 2 (speaker) 1 Department of Computer Science and Engineering University of Minnesota 2 Department of Mathematical and Statistical

More information

Mathematics Research Report No. MRR 003{96, HIGH RESOLUTION POTENTIAL FLOW METHODS IN OIL EXPLORATION Stephen Roberts 1 and Stephan Matthai 2 3rd Febr

Mathematics Research Report No. MRR 003{96, HIGH RESOLUTION POTENTIAL FLOW METHODS IN OIL EXPLORATION Stephen Roberts 1 and Stephan Matthai 2 3rd Febr HIGH RESOLUTION POTENTIAL FLOW METHODS IN OIL EXPLORATION Stephen Roberts and Stephan Matthai Mathematics Research Report No. MRR 003{96, Mathematics Research Report No. MRR 003{96, HIGH RESOLUTION POTENTIAL

More information

DELFT UNIVERSITY OF TECHNOLOGY

DELFT UNIVERSITY OF TECHNOLOGY DELFT UNIVERSITY OF TECHNOLOGY REPORT 10-16 An elegant IDR(s) variant that efficiently exploits bi-orthogonality properties Martin B. van Gijzen and Peter Sonneveld ISSN 1389-6520 Reports of the Department

More information

Universiteit-Utrecht. Department. of Mathematics. Jacobi-Davidson algorithms for various. eigenproblems. - A working document -

Universiteit-Utrecht. Department. of Mathematics. Jacobi-Davidson algorithms for various. eigenproblems. - A working document - Universiteit-Utrecht * Department of Mathematics Jacobi-Davidson algorithms for various eigenproblems - A working document - by Gerard L.G. Sleipen, Henk A. Van der Vorst, and Zhaoun Bai Preprint nr. 1114

More information

Arnoldi Methods in SLEPc

Arnoldi Methods in SLEPc Scalable Library for Eigenvalue Problem Computations SLEPc Technical Report STR-4 Available at http://slepc.upv.es Arnoldi Methods in SLEPc V. Hernández J. E. Román A. Tomás V. Vidal Last update: October,

More information