4.6 Iterative Solvers for Linear Systems

Why use iterative methods? Virtually all direct methods for solving $Ax = b$ require $O(n^3)$ floating point operations. In practical applications the matrix $A$ often has a very special structure and possibly contains many zero entries (is sparse). It is especially in these cases that iterative methods pay off. As an example, recall the discretized Poisson problem from the very first class meeting. Certain iterative methods were mentioned that can solve the resulting linear system in $O(n)$ operations. We now study some of these methods in detail.

Classical Iterative Solvers

We begin with a simple example.

Example: Consider the (iterative) solution of the linear system (see also the Maple worksheet 577 Iterative Solvers.mws)
\[
\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 6 \\ 6 \end{bmatrix}.
\]
The simplest approach is to use a fixed-point iteration scheme. Thus, we solve the first equation for $x_1$ and the second equation for $x_2$, and start the iteration with some initial guess $x^{(0)} = [x_1^{(0)}\ x_2^{(0)}]^T$. This results in
\[
x_1^{(k)} = \bigl(6 - x_2^{(k-1)}\bigr)/2, \qquad
x_2^{(k)} = \bigl(6 - x_1^{(k-1)}\bigr)/2, \qquad k = 1, 2, 3, \ldots.
\]
The method just described is known as Jacobi iteration, and a general algorithm is given by

Algorithm (Jacobi iteration)

Input $A$, $b$, $x$, $n$, $M$
for $k = 1$ to $M$ do
    for $i = 1$ to $n$ do
        $u_i = \bigl(b_i - \sum_{j \ne i} a_{ij} x_j\bigr)/a_{ii}$
    for $i = 1$ to $n$ do
        $x_i = u_i$
Output $x$
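To make the algorithm concrete, here is a minimal NumPy sketch (our own illustration, not part of the original notes; the function name and the stopping test on successive iterates are assumptions) of the Jacobi iteration applied to the $2 \times 2$ example:

import numpy as np

def jacobi(A, b, x0, M=100, tol=1e-10):
    """Jacobi iteration for Ax = b (a sketch of the algorithm above)."""
    x = x0.astype(float).copy()
    d = np.diag(A)                  # diagonal entries a_ii
    R = A - np.diag(d)              # off-diagonal part of A
    for k in range(M):
        u = (b - R @ x) / d         # u_i = (b_i - sum_{j != i} a_ij x_j) / a_ii
        if np.linalg.norm(u - x, np.inf) < tol:
            return u, k + 1
        x = u
    return x, M

A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([6.0, 6.0])
print(jacobi(A, b, np.zeros(2)))    # converges to the exact solution x1 = x2 = 2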

An obvious improvement of the Jacobi iteration is to use the most recent estimate for $x_i$ as soon as it becomes available. For the example above this results in
\[
x_1^{(k)} = \bigl(6 - x_2^{(k-1)}\bigr)/2, \qquad
x_2^{(k)} = \bigl(6 - x_1^{(k)}\bigr)/2, \qquad k = 1, 2, 3, \ldots.
\]
This method is known as Gauss-Seidel iteration, and the general algorithm is

Algorithm (Gauss-Seidel iteration)

Input $A$, $b$, $x$, $n$, $M$
for $k = 1$ to $M$ do
    for $i = 1$ to $n$ do
        $x_i = \bigl(b_i - \sum_{j \ne i} a_{ij} x_j\bigr)/a_{ii}$
Output $x$

Remarks:

1. Both algorithms can be made more efficient by moving the division by $a_{ii}$ to a preprocessing step.

2. For a generic problem Gauss-Seidel iteration converges faster than Jacobi iteration. However, there are certain problems for which Jacobi iteration converges faster.

3. Jacobi iteration has the advantage that it can be implemented on a parallel computer in a straightforward manner.

In a more abstract setting we want to consider the solution of the linear system $Ax = b$ by an iterative algorithm of the form
\[
x^{(k)} = G x^{(k-1)} + c, \qquad k = 1, 2, 3, \ldots, \tag{14}
\]
where $x^{(0)}$ is our initial guess, $G$ is a (constant) iteration matrix, and $c$ is a (constant) vector. All of the classical solvers discussed here are based on a splitting of $A$ of the form
\[
A = (A - Q) + Q
\]
with splitting matrix $Q$. Thus, provided $Q^{-1}$ exists,
\[
Ax = b \iff Qx + (A - Q)x = b \iff Qx = (Q - A)x + b \tag{15}
\]
\[
\iff x = \underbrace{\bigl(I - Q^{-1}A\bigr)}_{=G}\, x + \underbrace{Q^{-1}b}_{=c}. \tag{16}
\]
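For comparison with the Jacobi sketch, here is a minimal NumPy sketch of the Gauss-Seidel algorithm above (again our own illustration with assumed names); the only change is that each $x_i$ is overwritten as soon as it is updated:

import numpy as np

def gauss_seidel(A, b, x0, M=100, tol=1e-10):
    """Gauss-Seidel iteration: overwrite x_i as soon as it is computed."""
    x = x0.astype(float).copy()
    n = len(b)
    for k in range(M):
        x_old = x.copy()
        for i in range(n):
            # new values x_j for j < i, old values for j > i
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            return x, k + 1
    return x, M

A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([6.0, 6.0])
print(gauss_seidel(A, b, np.zeros(2)))   # needs roughly half as many iterations as Jacobi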

Our goals in choosing the splitting matrix $Q$ are:

1. $Q$ should be invertible, and systems with $Q$, i.e., (15), should be easy to solve.

2. The iteration (14) should converge quickly to a fixed point.

The two extreme choices $Q = A$ and $Q = I$ correspond to the original problem $Ax = b$ (so that nothing has been gained), and the so-called Richardson method
\[
x^{(k)} = (I - A)x^{(k-1)} + b = x^{(k-1)} + r^{(k-1)}
\]
(for which no system needs to be solved, but which, depending on $A$, may not converge). Therefore, one usually chooses $Q$ somewhere between these two extremes.

We can see that the Jacobi iteration as defined above corresponds to the choice $Q = \mathrm{diag}(A)$. To see this, note that
\[
Q^{-1}A =
\begin{bmatrix} \frac{1}{a_{11}} & & \\ & \ddots & \\ & & \frac{1}{a_{nn}} \end{bmatrix}
\begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & & & \vdots \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{bmatrix}
=
\begin{bmatrix} 1 & \frac{a_{12}}{a_{11}} & \ldots & \frac{a_{1n}}{a_{11}} \\ \frac{a_{21}}{a_{22}} & 1 & \ldots & \frac{a_{2n}}{a_{22}} \\ \vdots & & & \vdots \\ \frac{a_{n1}}{a_{nn}} & \frac{a_{n2}}{a_{nn}} & \ldots & 1 \end{bmatrix},
\]
i.e., the rows of $A$ are scaled by the reciprocals of the diagonal entries. Then we have
\[
Q^{-1}Ax =
\begin{bmatrix}
\bigl(\sum_j a_{1j}x_j\bigr)/a_{11} \\ \bigl(\sum_j a_{2j}x_j\bigr)/a_{22} \\ \vdots \\ \bigl(\sum_j a_{nj}x_j\bigr)/a_{nn}
\end{bmatrix}
\qquad \text{and} \qquad
Q^{-1}b =
\begin{bmatrix} b_1/a_{11} \\ b_2/a_{22} \\ \vdots \\ b_n/a_{nn} \end{bmatrix},
\]
so that (using (16) componentwise)
\[
x_i^{(k)} = x_i^{(k-1)} - \Bigl(\sum_j a_{ij}x_j^{(k-1)}\Bigr)/a_{ii} + b_i/a_{ii}
= \Bigl(b_i - \sum_{j \ne i} a_{ij}x_j^{(k-1)}\Bigr)/a_{ii},
\]
which is exactly the formula used in the algorithm.

The Gauss-Seidel iteration algorithm can be derived by taking $Q$ as the lower triangle of $A$ (including the diagonal), i.e.,
\[
Q_{ij} = \begin{cases} a_{ij}, & i \ge j, \\ 0, & \text{otherwise}. \end{cases}
\]
Thus (15) can be written as $Qx^{(k)} = (Q - A)x^{(k-1)} + b$, i.e.,
\[
\begin{bmatrix} a_{11} & & & \\ a_{21} & a_{22} & & \\ \vdots & & \ddots & \\ a_{n1} & a_{n2} & \ldots & a_{nn} \end{bmatrix} x^{(k)}
= -\begin{bmatrix} 0 & a_{12} & \ldots & a_{1n} \\ & 0 & \ddots & \vdots \\ & & \ddots & a_{n-1,n} \\ & & & 0 \end{bmatrix} x^{(k-1)} + b.
\]

If we solve this for $x_i^{(k)}$, $i = 1, \ldots, n$, we get
\[
\begin{aligned}
x_1^{(k)} &= \Bigl(-\sum_{j=2}^{n} a_{1j}x_j^{(k-1)} + b_1\Bigr)/a_{11}, \\
x_2^{(k)} &= \Bigl(-\sum_{j=3}^{n} a_{2j}x_j^{(k-1)} + b_2 - a_{21}x_1^{(k)}\Bigr)/a_{22}, \\
&\ \,\vdots \\
x_i^{(k)} &= \Bigl(-\sum_{j=i+1}^{n} a_{ij}x_j^{(k-1)} + b_i - \sum_{j=1}^{i-1} a_{ij}x_j^{(k)}\Bigr)/a_{ii}, \\
&\ \,\vdots \\
x_n^{(k)} &= \Bigl(b_n - \sum_{j=1}^{n-1} a_{nj}x_j^{(k)}\Bigr)/a_{nn}.
\end{aligned}
\]
This, however, is equivalent to
\[
x_i = \Bigl(b_i - \sum_{j \ne i} a_{ij}x_j\Bigr)/a_{ii}
\]
with the most recently computed values used on the right-hand side, which is the formula from the algorithm.

We now turn to a discussion of the properties required of a general iteration matrix $G$ (or splitting matrix $Q$) so that the fixed point iteration (14) is guaranteed to converge.

Theorem 4.14 If $\|G\| = \|I - Q^{-1}A\| < 1$ then (14) converges to a solution of $Ax = b$ for any starting vector $x^{(0)}$.

Proof: It is clear that (14) is a fixed point iteration ($x = F(x)$), and the fixed point is a solution of $Ax = b$. This follows immediately from the equivalence transformations involving (15) and (16). Now, let $e^{(k)} = x^{(k)} - x$, where $x$ is the fixed point solution. Then
\[
e^{(k)} = \bigl(Gx^{(k-1)} + c\bigr) - (Gx + c) = G\bigl(x^{(k-1)} - x\bigr) = Ge^{(k-1)}.
\]
Taking norms, we have
\[
\|e^{(k)}\| = \|Ge^{(k-1)}\| \le \|G\|\,\|e^{(k-1)}\|.
\]
Proceeding recursively we get
\[
\|e^{(k)}\| \le \|G\|^k \|e^{(0)}\|,
\]
so that $\|e^{(k)}\| \to 0$ if $\|G\| < 1$. Of course, this ensures $x^{(k)} \to x$, i.e., convergence of the method.
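The splitting point of view can be checked numerically. The sketch below (our own, with assumed names; forming $G$ explicitly is done only for illustration, since in practice one solves systems with $Q$ as in (15)) builds $G = I - Q^{-1}A$ and $c = Q^{-1}b$ for the Jacobi and Gauss-Seidel splittings of the $2 \times 2$ example and checks the sufficient condition of Theorem 4.14:

import numpy as np

def splitting_iteration(A, b, Q, x0, M=50):
    """Fixed point iteration (14) with G = I - Q^{-1}A and c = Q^{-1}b from (16)."""
    G = np.eye(len(b)) - np.linalg.solve(Q, A)
    c = np.linalg.solve(Q, b)
    print("||G||_inf =", np.linalg.norm(G, np.inf))   # < 1 suffices (Theorem 4.14)
    x = x0.astype(float).copy()
    for _ in range(M):
        x = G @ x + c
    return x

A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([6.0, 6.0])
print(splitting_iteration(A, b, np.diag(np.diag(A)), np.zeros(2)))  # Jacobi: Q = diag(A)
print(splitting_iteration(A, b, np.tril(A), np.zeros(2)))           # Gauss-Seidel: Q = lower triangle of A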

Corollary 4.15 If $A$ is diagonally dominant, then Jacobi iteration converges for any $x^{(0)}$.

Proof: By the definition of diagonal dominance (Definition 4.8 in Sect. 4.3) the entries of $A$ satisfy
\[
|a_{ii}| > \sum_{j \ne i} |a_{ij}|, \qquad i = 1, \ldots, n. \tag{17}
\]
We need to show $\|I - Q^{-1}A\| < 1$. Since the statement of Theorem 4.14 holds for any matrix norm, we can take $\|\cdot\|_\infty$. Earlier we showed that, for Jacobi iteration,
\[
Q^{-1}A =
\begin{bmatrix} 1 & \frac{a_{12}}{a_{11}} & \ldots & \frac{a_{1n}}{a_{11}} \\ \frac{a_{21}}{a_{22}} & 1 & \ldots & \frac{a_{2n}}{a_{22}} \\ \vdots & & & \vdots \\ \frac{a_{n1}}{a_{nn}} & \frac{a_{n2}}{a_{nn}} & \ldots & 1 \end{bmatrix}.
\]
But then
\[
\|I - Q^{-1}A\|_\infty = \max_{1 \le i \le n} \sum_{j \ne i} \frac{|a_{ij}|}{|a_{ii}|} < 1
\]
by the diagonal dominance of $A$ (17).

In order to be able to formulate a necessary and sufficient condition for the convergence of (14) we need to cite a lemma which follows from the Schur factorization theorem (Theorem 5.19).

Lemma 4.16 Any $n \times n$ matrix $A$ is similar to an upper triangular matrix $U$ whose off-diagonal elements are arbitrarily small, i.e., for any $\epsilon > 0$ there exists a nonsingular $n \times n$ matrix $S$ such that
\[
S^{-1}AS = U = D + T,
\]
where $D = \mathrm{diag}(U)$ (containing the eigenvalues of $A$) and $\|T\| < \epsilon$.

Proof: See textbook p. 214 (proof of Theorem 3).

Remark: Lemma 4.16 says that any $n \times n$ matrix $A$ is "almost" similar to a diagonal matrix. This is what we saw at work when computing eigenvalues via the QR iteration algorithm.

The next ingredient requires that we introduce the spectral radius of $A$, denoted by
\[
\rho(A) = \max_{1 \le i \le n} \{ |\lambda_i| : \lambda_i \text{ an eigenvalue of } A \}.
\]
Theorem 4.17 The spectral radius of $A$ is the greatest lower bound for the matrix norms of $A$ induced by vector norms, i.e., $\rho(A) = \inf \|A\|$, where the infimum is taken over all induced matrix norms.

Proof: First we show $\rho(A) \le \inf \|A\|$. Let $(\lambda, x)$ be the eigenpair of $A$ for which $\rho(A) = |\lambda|$. Then, for any induced matrix norm,
\[
\|A\| = \sup_{\|u\|=1} \|Au\| \ge \frac{\|Ax\|}{\|x\|} = \frac{\|\lambda x\|}{\|x\|} = |\lambda| = \rho(A).
\]
Taking the infimum over all induced matrix norms we get $\inf \|A\| \ge \rho(A)$.

Next we show that $\rho(A) \ge \inf \|A\|$. Applying the maximum norm to the factorization of Lemma 4.16 we have
\[
\|S^{-1}AS\|_\infty = \|D + T\|_\infty \le \|D\|_\infty + \|T\|_\infty.
\]
Now, Lemma 4.16 ensures that for any $\epsilon > 0$ we have $\|T\|_\infty < \epsilon$. Moreover, $D = \mathrm{diag}(\lambda_1, \ldots, \lambda_n)$ so that $\|D\|_\infty = \rho(A)$. Therefore,
\[
\|S^{-1}AS\|_\infty \le \rho(A) + \epsilon.
\]
From homework problem 4.6#6 we know that there exists another induced norm such that $\|A\| = \|S^{-1}AS\|_\infty$. Therefore $\|A\| \le \rho(A) + \epsilon$. Finally, we can take the infimum and obtain $\inf \|A\| \le \rho(A) + \epsilon$. Since this statement holds for arbitrary $\epsilon$ we have $\inf \|A\| \le \rho(A)$ and are done.

Now we can prove

Theorem 4.18 The iteration $x^{(k)} = Gx^{(k-1)} + c$ (cf. (14)) converges to the solution of $Ax = b$ for any initial guess $x^{(0)}$ if and only if $\rho(G) < 1$.

Proof: First we assume $\rho(G) < 1$ and show convergence. Theorem 4.17 tells us that there exists a matrix norm for which $\|G\| < 1$, and then Theorem 4.14 ensures convergence.

To prove the reverse implication we assume $\rho(G) \ge 1$ and deduce that the fixed point iteration does not converge for every starting vector. Let $(\lambda, u)$ be an eigenpair of $G$ with $|\lambda| \ge 1$. Then we choose $x^{(0)}$ as $x^{(0)} = u + x$, where $x$ is the fixed point of (14). Now, using (14) we have
\[
x^{(k)} - x = G\bigl(x^{(k-1)} - x\bigr).
\]

By applying this idea recursively we obtain
\[
x^{(k)} - x = G^k\bigl(x^{(0)} - x\bigr) = G^k u = \lambda^k u.
\]
By the choice of $(\lambda, u)$, with $|\lambda| \ge 1$ and $u \ne 0$, we see that $x^{(k)}$ does not converge to $x$.

A sufficient convergence criterion for Gauss-Seidel iteration can now also be formulated.

Theorem 4.19 If $A$ is symmetric and positive definite then Gauss-Seidel iteration converges for any initial vector $x^{(0)}$.

Remark: The statement of the theorem can be relaxed. All that is needed is that $A$ is symmetric positive semi-definite with positive diagonal entries, and such that a solution of $Ax = b$ exists. This slightly weaker version ensures that the normal equations can be solved using Gauss-Seidel iteration.

Proof of Theorem 4.19: According to Theorem 4.18 we need to show $\rho(G) < 1$, where, for Gauss-Seidel iteration, we have
\[
G = I - Q^{-1}A = I - (L + D)^{-1}A. \tag{18}
\]
Here we have decomposed $A$ into
\[
A = \underbrace{L + D}_{=Q} + \underbrace{L^T}_{=A-Q},
\]
where $L$ is the lower triangular part of $A$ without the diagonal and $D = \mathrm{diag}(A)$.

We begin by defining a vector norm
\[
\|x\|_A = \sqrt{x^T A x},
\]
which induces a matrix norm
\[
\|G\|_A = \max_{x \ne 0} \frac{\|Gx\|_A}{\|x\|_A}.
\]
This allows us to make use of Theorem 4.17, and we can instead show
\[
\frac{\|Gx\|_A^2}{\|x\|_A^2} = \frac{(Gx)^T A Gx}{x^T A x} < 1, \tag{19}
\]
since then $\rho(G) \le \|G\|_A < 1$. Now, (19) is equivalent to
\[
x^T\bigl(A - G^T A G\bigr)x > 0. \tag{20}
\]
Using the definition of $G$ (18) some messy manipulations yield
\[
A - G^T A G = A(L^T + D)^{-1} D (L + D)^{-1} A
= \bigl[A(L^T + D)^{-1} D^{1/2}\bigr]\bigl[D^{1/2}(L + D)^{-1} A\bigr],
\]
where the positive definiteness of $A$ ($d_i > 0$) permits the last step (taking the square root of $D$).

Returning to (20), and using the symmetry of $A$, we have
\[
x^T\bigl(A - G^T A G\bigr)x
= \underbrace{x^T A (L^T + D)^{-1} D^{1/2}}_{=y^T}\,\underbrace{D^{1/2}(L + D)^{-1} A x}_{=y}
= y^T y \ge 0.
\]
In fact, this inequality is strict for $x \ne 0$ since $D^{1/2}(L + D)^{-1}A$ has full rank.

Rate of Convergence

We want to have an estimate for how many iterations are required in the fixed point iteration $x^{(k)} = Gx^{(k-1)} + c$ to reduce the initial error $e^{(0)} = x^{(0)} - x$ by a factor of $10^{-\alpha}$ for some positive exponent $\alpha$. We know (see the proof of Theorem 4.14) that the fixed point iteration scheme leads to an error estimate of the form
\[
\|e^{(k)}\| \le \|G\|^k \|e^{(0)}\|.
\]
Now, using Theorem 4.17 we can conclude that essentially
\[
\|e^{(k)}\| \le [\rho(G)]^k \|e^{(0)}\|.
\]
Our goal is to reduce the initial error by a factor $10^{-\alpha}$. Therefore, we want
\[
[\rho(G)]^k \|e^{(0)}\| \le 10^{-\alpha} \|e^{(0)}\|,
\qquad \text{or} \qquad
[\rho(G)]^k \le 10^{-\alpha}.
\]
Thus
\[
k \ln(\rho(G)) \le -\alpha \ln 10.
\]
Since, for a convergent fixed point iteration, we have $0 < \rho(G) < 1$ (so that $\ln \rho(G) < 0$), we get
\[
k \ge \frac{\alpha \ln 10}{-\ln(\rho(G))}.
\]
If we define the rate of convergence
\[
r = \frac{-\ln(\rho(G))}{\ln 10} = -\log_{10}(\rho(G)),
\]
then we should have $k \ge \alpha/r$ to ensure the desired improvement.
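A small sketch (our own, with assumed function names) that computes $\rho(G)$, the rate of convergence $r$, and the resulting iteration estimate $k \ge \alpha/r$ for the Jacobi and Gauss-Seidel splittings of the $2 \times 2$ example:

import numpy as np

def convergence_info(A, Q, alpha):
    """Spectral radius rho(G), rate r = -log10 rho(G), and the estimated
    number of iterations k >= alpha / r needed to gain alpha digits."""
    G = np.eye(A.shape[0]) - np.linalg.solve(Q, A)
    rho = max(abs(np.linalg.eigvals(G)))        # rho(G) < 1 iff convergent (Theorem 4.18)
    r = -np.log10(rho)
    return rho, r, int(np.ceil(alpha / r))

A = np.array([[2.0, 1.0], [1.0, 2.0]])
print(convergence_info(A, np.diag(np.diag(A)), alpha=6))   # Jacobi:       rho = 0.5,  k ~ 20
print(convergence_info(A, np.tril(A), alpha=6))            # Gauss-Seidel: rho = 0.25, k ~ 10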

Successive Over-Relaxation (SOR)

Another iterative solver can be obtained by computing the new iterate as a weighted combination of the previous iterate $x^{(k-1)}$ and the Gauss-Seidel update, i.e.,
\[
x^{(k)} = (1 - \omega)\,x^{(k-1)} + \omega\, x_{GS}^{(k)},
\]
with the relaxation parameter $\omega$ chosen appropriately. This is a generalization of the Gauss-Seidel iteration (corresponding to $\omega = 1$). The standard SOR algorithm is as follows (a more general version is discussed in the textbook):

Algorithm (SOR iteration)

Input $A$, $b$, $x$, $\omega$, $n$, $M$
for $k = 1$ to $M$ do
    for $i = 1$ to $n$ do
        $u_i = (1 - \omega)\,x_i + \omega\Bigl(b_i - \sum_{j=1}^{i-1} a_{ij}u_j - \sum_{j=i+1}^{n} a_{ij}x_j\Bigr)/a_{ii}$
    for $i = 1$ to $n$ do
        $x_i = u_i$
Output $x$

Theorem 4.20 If $A$ is symmetric and positive definite and $0 < \omega < 2$ then SOR iteration converges for any starting value $x^{(0)}$.

Proof: omitted.

Remarks:

1. The optimal value of $\omega$ is known only for special cases. For example, if $A$ is tridiagonal, then
\[
\omega_{\mathrm{opt}} = \frac{2}{1 + \sqrt{1 - \rho(G)}},
\]
where $G$ is the iteration matrix for the Gauss-Seidel iteration.

2. For (sparse) matrices with special structure the basic iteration algorithms can be implemented much more efficiently. For example, for the Poisson problem discussed at the beginning of the semester Jacobi iteration becomes

for $k = 1$ to $M$ do
    for $i = 1$ to $n-1$ do
        for $j = 1$ to $n-1$ do
            $u_{ij}^{(k)} = \bigl(u_{i-1,j}^{(k-1)} + u_{i,j-1}^{(k-1)} + u_{i+1,j}^{(k-1)} + u_{i,j+1}^{(k-1)} + f_{ij}/n^2\bigr)/4$

Note that even though the system matrix $A$ is of size $n^2 \times n^2$, only five entries per row are non-zero, and they are hard-coded into the algorithm. Thus, the matrix $A$ is never explicitly and completely formed. In Matlab the $i$-$j$ double loops can be implemented in one single line of code (a sketch of this is given after these remarks).

3. A few other classical iterative methods such as symmetric successive over-relaxation (SSOR) and Chebyshev iteration are discussed in the textbook.

4. Due to their rather slow convergence, classical iterative solvers are no longer very popular as solvers of linear systems. However, they are frequently used as preconditioners for more sophisticated methods (e.g., conjugate gradient or multigrid).
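To illustrate remark 2, here is a NumPy analogue (our own sketch; the grid layout, boundary treatment, and names are assumptions) of the matrix-free Jacobi sweep for the discretized Poisson problem; the five-point stencil update is a single vectorized statement and the matrix $A$ is never formed:

import numpy as np

def jacobi_poisson(f, M):
    """Jacobi sweeps for the discretized Poisson problem on an n x n grid
    with mesh width h = 1/n and zero boundary values; the five-point stencil
    is hard-coded, so the n^2 x n^2 matrix A is never assembled."""
    n = f.shape[0] + 1                     # f holds the (n-1) x (n-1) interior values
    u = np.zeros((n + 1, n + 1))           # includes the boundary, which stays zero
    for _ in range(M):
        u[1:-1, 1:-1] = (u[:-2, 1:-1] + u[1:-1, :-2] +
                         u[2:, 1:-1] + u[1:-1, 2:] + f / n**2) / 4.0
    return u

n = 32
f = np.ones((n - 1, n - 1))                # right-hand side values f_ij = 1
u = jacobi_poisson(f, M=500)
print(u.max())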

4.7 Steepest Descent and Conjugate Gradient Methods

Even though the SOR method may converge faster than Jacobi or Gauss-Seidel iteration, it is difficult to choose a good relaxation parameter $\omega$. Therefore, most of the time another class of iterative solvers is used for linear systems with symmetric positive definite system matrices. These methods are based on

Theorem 4.21 If $A$ is symmetric positive definite, then solving $Ax = b$ is equivalent to minimizing the quadratic form
\[
q(x) = \langle x, Ax \rangle - 2\langle x, b \rangle.
\]
Remarks:

1. Here we have used the notation $\langle x, y \rangle = x^T y = \sum_{i=1}^{n} x_i y_i$ for the standard inner product in $\mathbb{R}^n$.

2. The family of optimization-based iterative solvers is also known as Krylov subspace iteration in the literature.

Proof: First we consider only changes of $q$ along the ray $x + tv$, where $t \in \mathbb{R}$ and $v \ne 0$ is a fixed direction. By definition
\[
\begin{aligned}
q(x + tv) &= \langle x + tv, A(x + tv) \rangle - 2\langle x + tv, b \rangle \\
&= \langle x, Ax \rangle + \langle tv, Ax \rangle + \langle x, A(tv) \rangle + \langle tv, A(tv) \rangle - 2\langle x, b \rangle - 2\langle tv, b \rangle \\
&= \underbrace{\langle x, Ax \rangle - 2\langle x, b \rangle}_{=q(x)} + \langle tv, Ax \rangle + \langle x, A(tv) \rangle + \langle tv, A(tv) \rangle - 2\langle tv, b \rangle.
\end{aligned} \tag{21}
\]
Now, using the symmetry of $A$,
\[
\langle x, A(tv) \rangle = t\,x^T(Av) = t\,\bigl(v^T A^T x\bigr)^T = t\,\bigl(v^T A x\bigr)^T = t\,v^T A x,
\]
where the last equality holds because the inner product is a scalar quantity. Therefore,
\[
\langle x, A(tv) \rangle = t\,\langle v, Ax \rangle. \tag{22}
\]
Inserting (22) into (21) we get
\[
q(x + tv) = q(x) + 2t\langle v, Ax \rangle + t^2\langle v, Av \rangle - 2t\langle v, b \rangle
= q(x) + 2t\langle v, Ax - b \rangle + t^2\langle v, Av \rangle. \tag{23}
\]
Note that here $\langle v, Av \rangle > 0$ since $A$ is positive definite and $v \ne 0$. Therefore, as a quadratic function of $t$ (with positive leading coefficient), $q(x + tv)$ has a minimum. A necessary condition for the minimum (along the ray $x + tv$) is
\[
\frac{d}{dt} q(x + tv) = 0.
\]
Thus, we calculate
\[
\frac{d}{dt} q(x + tv) = 2\langle v, Ax - b \rangle + 2t\langle v, Av \rangle. \tag{24}
\]
Clearly, (24) is zero for
\[
\hat{t} = -\frac{\langle v, Ax - b \rangle}{\langle v, Av \rangle} = \frac{\langle v, b - Ax \rangle}{\langle v, Av \rangle}.
\]
The value of the minimum is obtained from (23):
\[
\begin{aligned}
q(x + \hat{t}v) &= q(x) - 2\frac{\langle v, b - Ax \rangle}{\langle v, Av \rangle}\langle v, b - Ax \rangle + \frac{\langle v, b - Ax \rangle^2}{\langle v, Av \rangle^2}\langle v, Av \rangle \\
&= q(x) - 2\frac{\langle v, b - Ax \rangle^2}{\langle v, Av \rangle} + \frac{\langle v, b - Ax \rangle^2}{\langle v, Av \rangle} \\
&= q(x) - \frac{\langle v, b - Ax \rangle^2}{\langle v, Av \rangle}.
\end{aligned}
\]
Now, $q(x + \hat{t}v) < q(x)$ if and only if $\langle v, b - Ax \rangle \ne 0$, i.e., the direction $v$ is not orthogonal to the residual. Thus the value of $q$ can be reduced by moving from $x$ to $x + \hat{t}v$ along a direction $v$ that is not orthogonal to the residual $b - Ax$. To conclude the proof we have two possibilities to consider:

1. $x$ is a solution of $Ax = b$, i.e., $b - Ax = 0$. Then $q(x)$ is actually the minimum value, and no choice of $v$ can reduce the value of $q$ (i.e., $x$ a solution of $Ax = b$ $\Longrightarrow$ $x$ a minimizer of $q$).

2. $x$ is such that $Ax \ne b$. Then there exists a direction $v$ such that $\langle v, b - Ax \rangle \ne 0$, and so $q(x)$ is not a minimum (and thus can be improved, which we do until we have reached the minimum and case 1). So, $x$ a minimizer of $q$ $\Longrightarrow$ $x$ a solution of $Ax = b$.

The claim is established.

The argument used at the end of the proof suggests an iterative algorithm for the solution of $Ax = b$ via minimization of $q$.

"Algorithm"

Input $A$, $b$, initial guess $x^{(0)}$, initial search direction $v^{(0)}$
for $k = 0, 1, \ldots$ do
    Calculate the stepsize $t = \dfrac{\langle v^{(k)}, b - Ax^{(k)} \rangle}{\langle v^{(k)}, Av^{(k)} \rangle}$
    Update the solution $x^{(k+1)} = x^{(k)} + tv^{(k)}$
    Pick a new search direction $v^{(k+1)}$
Output $x^{(k+1)}$

Remark: The preceding "algorithm" is in quotation marks since it is not yet a complete algorithm. We have not yet answered how to choose the search directions $v^{(k+1)}$. The only constraint we have so far is that $\langle v, b - Ax \rangle \ne 0$.

Method of Steepest Descent

From Calculus we know that the gradient of $q$ indicates the direction of greatest change. Therefore, in the so-called method of steepest descent we choose $v^{(k+1)} = -\nabla q(x^{(k+1)})$. In homework problem 4.7#1 you will show that
\[
\nabla q(x^{(k+1)}) = 2\bigl(Ax^{(k+1)} - b\bigr),
\]
so that (dropping the irrelevant factor 2) the search direction is the residual $v^{(k+1)} = b - Ax^{(k+1)}$, and we get

Algorithm (Steepest descent)

Input $A$, $b$, initial guess $x$
for $k = 0, 1, \ldots$ do
    $v = b - Ax$
    $t = \dfrac{\langle v, v \rangle}{\langle v, Av \rangle}$
    $x = x + tv$        (see the "rough" algorithm above)
Output $x$
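A minimal NumPy sketch (our own illustration, with assumed names) of the steepest descent algorithm above:

import numpy as np

def steepest_descent(A, b, x0, M=1000, tol=1e-10):
    """Steepest descent: search along the residual v = b - Ax (the negative
    gradient direction of q) with the optimal step t = <v,v>/<v,Av>."""
    x = x0.astype(float).copy()
    for k in range(M):
        v = b - A @ x                     # residual = search direction
        if np.linalg.norm(v) < tol:
            return x, k
        t = (v @ v) / (v @ (A @ v))       # optimal stepsize along v
        x = x + t * v
    return x, M

A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([6.0, 6.0])
print(steepest_descent(A, b, np.zeros(2)))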

Remark: This method is not very practical, as it usually converges rather slowly. Since the level curves of $q$ are ellipses (for problems with two unknowns), the search directions tend to zig-zag down the surface instead of going straight to the bottom of the valley (especially for elongated ellipses, i.e., $\kappa_2(A)$ large). See Figure 4.3 in the textbook.

Conjugate Search Directions

The main ingredient in deriving another one of the "top ten algorithms of the 20th century" (the conjugate gradient method) will be the use of so-called conjugate search directions. To this end we introduce

Definition 4.22 Let $A$ be a symmetric positive definite matrix. Then the inner product
\[
\langle x, y \rangle_A = \langle x, Ay \rangle
\]
is called an $A$-inner product. The $A$-inner product induces a vector norm
\[
\|x\|_A = \sqrt{\langle x, x \rangle_A} = \sqrt{\langle x, Ax \rangle}.
\]
We call a set of vectors $\{u^{(1)}, \ldots, u^{(n)}\}$ $A$-orthonormal if $\langle u^{(i)}, u^{(j)} \rangle_A = \delta_{ij}$.

The main theorem (which will give us the conjugate gradient method) is

Theorem 4.23 Let $\{u^{(1)}, \ldots, u^{(n)}\}$ be an $A$-orthonormal set, and define
\[
x^{(i)} = x^{(i-1)} + \langle b - Ax^{(i-1)}, u^{(i)} \rangle\, u^{(i)}, \qquad i = 1, \ldots, n, \tag{25}
\]
with $x^{(0)}$ an arbitrary starting vector. Then $Ax^{(n)} = b$.

Remarks:

1. Note that Theorem 4.23 says that the iteration (25) yields the solution of $Ax = b$ in $n$ steps, and thus can be considered a direct method. However, in practice this usually does not happen (due to roundoff errors). Historically, this means that after its introduction in 1952 by Hestenes and Stiefel the conjugate gradient method was considered a useless (direct) method, and was long forgotten.

2. In the 1970s it was observed that for certain (large, sparse, and well-conditioned) problems the conjugate gradient method converges extremely fast to an accurate approximate solution of $Ax = b$ (much faster than in $n$ steps), and thus it has become one of the top iterative solvers. In fact, one can show that, for well-conditioned problems, the conjugate gradient method converges in $O(\sqrt{\kappa})$ iterations.
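Before turning to the proof, here is a small numerical check of Theorem 4.23 (our own sketch with an assumed test matrix): the canonical basis is $A$-orthonormalized by Gram-Schmidt in the $A$-inner product, and iteration (25) then reaches the solution in $n$ steps up to roundoff.

import numpy as np

def a_orthonormal_basis(A):
    """Gram-Schmidt in the A-inner product <x,y>_A = <x, Ay>, applied to e^(1),...,e^(n)."""
    n = A.shape[0]
    U = []
    for i in range(n):
        v = np.eye(n)[:, i]
        for u in U:
            v = v - (v @ (A @ u)) * u          # remove the <e^(i), u>_A component
        U.append(v / np.sqrt(v @ (A @ v)))     # normalize in the A-norm
    return np.array(U).T                       # columns are u^(1),...,u^(n)

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
U = a_orthonormal_basis(A)
x = np.zeros(3)
for i in range(3):                             # iteration (25)
    u = U[:, i]
    x = x + ((b - A @ x) @ u) * u
print(np.linalg.norm(A @ x - b))               # ~1e-16: Ax^(n) = b after n steps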

Proof of Theorem 4.23: From (25) we see that we can define the stepsize as
\[
t_i = \langle b - Ax^{(i-1)}, u^{(i)} \rangle. \tag{26}
\]
Then (25) becomes
\[
x^{(i)} = x^{(i-1)} + t_i u^{(i)}.
\]
We multiply this by $A$ and get
\[
Ax^{(i)} = Ax^{(i-1)} + t_i Au^{(i)}. \tag{27}
\]
Therefore, proceeding recursively,
\[
Ax^{(n)} = Ax^{(n-1)} + t_n Au^{(n)} = Ax^{(n-2)} + t_{n-1} Au^{(n-1)} + t_n Au^{(n)} = \cdots
= Ax^{(0)} + t_1 Au^{(1)} + t_2 Au^{(2)} + \cdots + t_n Au^{(n)}.
\]
Now we subtract $b$ and take the inner product with $u^{(i)}$ (which allows us to use the $A$-orthonormality of $\{u^{(1)}, \ldots, u^{(n)}\}$) to arrive at
\[
\langle Ax^{(n)} - b, u^{(i)} \rangle
= \langle Ax^{(0)} - b, u^{(i)} \rangle + \sum_{j=1}^{n} t_j \underbrace{\langle Au^{(j)}, u^{(i)} \rangle}_{=\delta_{ij}}
= \langle Ax^{(0)} - b, u^{(i)} \rangle + t_i.
\]
If we can show that
\[
t_i = -\langle Ax^{(0)} - b, u^{(i)} \rangle, \tag{28}
\]
then $\langle Ax^{(n)} - b, u^{(i)} \rangle = 0$, and so $Ax^{(n)} - b$ is orthogonal to all $u^{(i)}$, $i = 1, \ldots, n$. In order to see the implications of this statement we observe below that $\{u^{(1)}, \ldots, u^{(n)}\}$ form a basis for $\mathbb{R}^n$, and thus $Ax^{(n)} - b = 0$ (which proves the statement of the theorem).

From the $A$-orthonormality of $\{u^{(1)}, \ldots, u^{(n)}\}$ we know
\[
\langle Au^{(j)}, u^{(i)} \rangle = \delta_{ij},
\]
so that, with $U$ the matrix whose columns are the $u^{(i)}$,
\[
U^T A U = I.
\]
This implies that both $A$ and $U$ are nonsingular, and thus the columns $u^{(i)}$, $i = 1, \ldots, n$, of $U$ form a basis of $\mathbb{R}^n$.

It now remains to establish (28). From (26) we have
\[
\begin{aligned}
t_i &= \langle b - Ax^{(i-1)}, u^{(i)} \rangle \\
&= \langle b - Ax^{(0)} + Ax^{(0)} - Ax^{(1)} + Ax^{(1)} - \cdots - Ax^{(i-2)} + Ax^{(i-2)} - Ax^{(i-1)}, u^{(i)} \rangle \\
&= \langle b - Ax^{(0)}, u^{(i)} \rangle + \underbrace{\langle Ax^{(0)} - Ax^{(1)}, u^{(i)} \rangle}_{=-t_1\langle Au^{(1)}, u^{(i)} \rangle} + \cdots + \underbrace{\langle Ax^{(i-2)} - Ax^{(i-1)}, u^{(i)} \rangle}_{=-t_{i-1}\langle Au^{(i-1)}, u^{(i)} \rangle},
\end{aligned}
\]
where we have used (27). Thus,
\[
t_i = \langle b - Ax^{(0)}, u^{(i)} \rangle
- t_1\underbrace{\langle Au^{(1)}, u^{(i)} \rangle}_{=0}
- \cdots
- t_{i-1}\underbrace{\langle Au^{(i-1)}, u^{(i)} \rangle}_{=0}
= -\langle Ax^{(0)} - b, u^{(i)} \rangle,
\]

which establishes (28).

The main question now is: how do we find $\{u^{(1)}, \ldots, u^{(n)}\}$?

We can use any basis of $\mathbb{R}^n$, and then obtain $\{u^{(1)}, \ldots, u^{(n)}\}$ by applying the Gram-Schmidt algorithm (with respect to the $A$-inner product) to this set. Therefore, we start with the canonical basis $\{e^{(1)}, \ldots, e^{(n)}\}$. Then
\[
u^{(1)} = \frac{e^{(1)}}{\|e^{(1)}\|_A} = \frac{e^{(1)}}{\sqrt{\langle e^{(1)}, Ae^{(1)} \rangle}},
\]
and the general step is
\[
u^{(i)} = \frac{v^{(i)}}{\|v^{(i)}\|_A}, \qquad \text{where} \qquad
v^{(i)} = e^{(i)} - \sum_{j=1}^{i-1} \langle e^{(i)}, u^{(j)} \rangle_A\, u^{(j)}.
\]
Since $\langle e^{(i)}, u^{(j)} \rangle_A = \langle e^{(i)}, Au^{(j)} \rangle = \bigl(Au^{(j)}\bigr)_i$ we have
\[
u^{(1)} = \frac{e^{(1)}}{\sqrt{a_{11}}}
\qquad \text{and} \qquad
v^{(i)} = e^{(i)} - \sum_{j=1}^{i-1} \bigl(Au^{(j)}\bigr)_i\, u^{(j)}.
\]
Before stating the conjugate gradient algorithm we reformulate Theorem 4.23 for an $A$-orthogonal (instead of $A$-orthonormal) set.

Theorem 4.24 Let $\{v^{(1)}, \ldots, v^{(n)}\}$ be an $A$-orthogonal set, and define
\[
t_i = \frac{\langle b - Ax^{(i-1)}, v^{(i)} \rangle}{\|v^{(i)}\|_A^2}.
\]
Then, for any starting vector $x^{(0)}$, the iteration
\[
x^{(i)} = x^{(i-1)} + t_i v^{(i)}, \qquad i = 1, \ldots, n,
\]
yields $Ax^{(n)} = b$.

Proof: Obvious.

Remark: Here the search directions $v^{(i)}$ are fixed before the algorithm is started. It is better to determine $v^{(i)}$ when needed during the algorithm by making sure that at each step the components in the previously used directions are removed. This is done in the following algorithm.

Algorithm (Conjugate gradient method)

(1) Input $A$, $b$, $x^{(0)}$, $M$, $\epsilon$
(2) $r^{(0)} = b - Ax^{(0)}$
(3) $v^{(0)} = r^{(0)}$
(4) Output $0$, $x^{(0)}$, $r^{(0)}$
(5) for $k = 0$ to $M-1$ do
(6)     if $v^{(k)} = 0$ then stop
(7)     $t_k = \dfrac{\|r^{(k)}\|_2^2}{\|v^{(k)}\|_A^2}$
(8)     $x^{(k+1)} = x^{(k)} + t_k v^{(k)}$
(9)     $r^{(k+1)} = r^{(k)} - t_k A v^{(k)}$
(10)    if $\|r^{(k+1)}\|_2 < \epsilon$ then stop
(11)    $s_k = \dfrac{\|r^{(k+1)}\|_2^2}{\|r^{(k)}\|_2^2}$
(12)    $v^{(k+1)} = r^{(k+1)} + s_k v^{(k)}$
(13)    Output $k+1$, $x^{(k+1)}$, $r^{(k+1)}$
(14) end

Remarks:

1. A computer version (with lots of overwriting) is listed in the textbook.

2. The conjugate gradient algorithm is very efficient since the bulk of the work for each iteration is the one matrix-vector multiplication $Av^{(k)}$ needed for the update of the residual. It is essential that this operation be customized for the matrix $A$. For many sparse matrices it can be done in $O(n)$ operations, and thus the overall operations count of the conjugate gradient method for many problems is also $O(n)$.

3. The Matlab script CGDemo.m illustrates the convergence behavior of the conjugate gradient algorithm. In this example the CG algorithm is applied to a symmetric positive definite matrix. The parameter $\tau$ affects the number of nonzero entries as well as the 2-norm condition number of $A$. This is done by replacing all off-diagonal entries of $A$ with $|a_{ij}| > \tau$ by zero, i.e., for small $\tau$ the matrix $A$ is very sparse and well-conditioned, whereas for large $\tau$ both the number of non-zero entries and the condition number increase. We can see that for the ill-conditioned case no convergence at all is achieved in the allowed 20 iterations, whereas for the well-conditioned matrix the CG algorithm converges in 8 iterations to a solution with the desired relative residual accuracy.
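A NumPy sketch of the conjugate gradient algorithm above (our own illustration; the function name and return values are assumptions), with the line numbers of the algorithm noted in comments:

import numpy as np

def conjugate_gradient(A, b, x0, M=None, eps=1e-10):
    """Conjugate gradient method, following lines (1)-(13) of the algorithm above."""
    x = x0.astype(float).copy()
    r = b - A @ x                           # line (2)
    v = r.copy()                            # line (3)
    if M is None:
        M = len(b)                          # at most n steps in exact arithmetic
    for k in range(M):                      # line (5)
        if not np.any(v):                   # line (6): v^(k) = 0
            break
        Av = A @ v
        t = (r @ r) / (v @ Av)              # line (7): t_k = ||r||_2^2 / ||v||_A^2
        x = x + t * v                       # line (8)
        r_new = r - t * Av                  # line (9)
        if np.linalg.norm(r_new) < eps:     # line (10)
            return x, k + 1
        s = (r_new @ r_new) / (r @ r)       # line (11)
        v = r_new + s * v                   # line (12)
        r = r_new
    return x, M

A = np.array([[2.0, 1.0], [1.0, 2.0]])
b = np.array([6.0, 6.0])
print(conjugate_gradient(A, b, np.zeros(2)))   # exact solution after at most 2 steps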

We now need to show that this algorithm agrees with the theory developed so far. Thus, we need to show that $\{v^{(0)}, v^{(1)}, \ldots, v^{(n-1)}\}$ is $A$-orthogonal (see c) in Theorem 4.25). Moreover, we will show that the residuals $\{r^{(0)}, r^{(1)}, \ldots, r^{(n-1)}\}$ are orthogonal (cf. e)), and that they can be computed recursively as formulated in the algorithm (follows from d) below).

Theorem 4.25 If we use the conjugate gradient algorithm to solve the linear system $Ax = b$, where $A$ is an $n \times n$ symmetric positive definite matrix, then for any $m < n$

a) $\langle r^{(m)}, v^{(i)} \rangle = 0$, $(0 \le i < m)$,
b) $\langle r^{(i)}, r^{(i)} \rangle = \langle r^{(i)}, v^{(i)} \rangle$, $(0 \le i \le m)$,
c) $\langle v^{(m)}, Av^{(i)} \rangle = 0$, $(0 \le i < m)$,
d) $r^{(i)} = b - Ax^{(i)}$, $(0 \le i \le m)$,
e) $\langle r^{(m)}, r^{(i)} \rangle = 0$, $(0 \le i < m)$,
f) $r^{(i)} \ne 0$, $(0 \le i \le m)$.

Remark: Items a) and b) are not directly related to the claims formulated above, but are required to prove the other statements.

Proof: We use induction on $m$. The beginning, $m = 0$, is trivial. Now we assume that the statements a)-f) hold for $m$ and prove they are true for $m+1$.

a) Show $\langle r^{(m+1)}, v^{(i)} \rangle = 0$, $(0 \le i \le m)$. First, we treat the case $i = m$. Using the recursive definition of the residuals from line (9) of the CG algorithm we have
\[
\langle r^{(m+1)}, v^{(m)} \rangle = \langle r^{(m)} - t_m Av^{(m)}, v^{(m)} \rangle
= \langle r^{(m)}, v^{(m)} \rangle - t_m \langle Av^{(m)}, v^{(m)} \rangle.
\]
The definition of the stepsize, line (7), yields
\[
\langle r^{(m+1)}, v^{(m)} \rangle
= \langle r^{(m)}, v^{(m)} \rangle - \frac{\|r^{(m)}\|_2^2}{\|v^{(m)}\|_A^2}\langle Av^{(m)}, v^{(m)} \rangle
= \langle r^{(m)}, v^{(m)} \rangle - \langle r^{(m)}, r^{(m)} \rangle = 0,
\]
where the last simplification follows from the induction hypothesis for statement b).

Now we discuss the case $i < m$. From line (9) of the algorithm we get
\[
\langle r^{(m+1)}, v^{(i)} \rangle = \langle r^{(m)} - t_m Av^{(m)}, v^{(i)} \rangle
= \underbrace{\langle r^{(m)}, v^{(i)} \rangle}_{=0} - t_m \underbrace{\langle Av^{(m)}, v^{(i)} \rangle}_{=0} = 0,
\]
where we have used the induction hypotheses for statements a) and c), respectively.

b) We show $\langle r^{(m+1)}, r^{(m+1)} \rangle = \langle r^{(m+1)}, v^{(m+1)} \rangle$. Using line (12) of the algorithm,
\[
\langle r^{(m+1)}, v^{(m+1)} \rangle = \langle r^{(m+1)}, r^{(m+1)} + s_m v^{(m)} \rangle
= \langle r^{(m+1)}, r^{(m+1)} \rangle + s_m \underbrace{\langle r^{(m+1)}, v^{(m)} \rangle}_{=0}
= \langle r^{(m+1)}, r^{(m+1)} \rangle.
\]

Here we have made use of the statement just proved in a) above.

c) We need to show $\langle v^{(m+1)}, Av^{(i)} \rangle = 0$ for $0 \le i \le m$. First we discuss the case $0 \le i < m$. Line (12) of the algorithm yields
\[
\langle v^{(m+1)}, Av^{(i)} \rangle = \langle r^{(m+1)} + s_m v^{(m)}, Av^{(i)} \rangle
= \langle r^{(m+1)}, Av^{(i)} \rangle + s_m \langle v^{(m)}, Av^{(i)} \rangle.
\]
Next, line (9) of the algorithm implies
\[
Av^{(k)} = \frac{1}{t_k}\bigl(r^{(k)} - r^{(k+1)}\bigr).
\]
Using this formula on the first term of the right-hand side above we have
\[
\langle v^{(m+1)}, Av^{(i)} \rangle
= \frac{1}{t_i}\langle r^{(m+1)}, r^{(i)} - r^{(i+1)} \rangle + s_m \langle v^{(m)}, Av^{(i)} \rangle
= \frac{1}{t_i}\bigl\langle r^{(m+1)}, v^{(i)} - s_{i-1}v^{(i-1)} - \bigl(v^{(i+1)} - s_i v^{(i)}\bigr) \bigr\rangle + s_m \langle v^{(m)}, Av^{(i)} \rangle,
\]
where we have used line (12) (together with the understanding that $s_{-1} = 0$, $v^{(-1)} = 0$) to replace $r^{(i)}$ and $r^{(i+1)}$. We now expand the right-hand side, and get
\[
\langle v^{(m+1)}, Av^{(i)} \rangle
= \frac{1}{t_i}\Bigl(\underbrace{\langle r^{(m+1)}, v^{(i)} \rangle}_{=0\ \text{by a)}}
- s_{i-1}\underbrace{\langle r^{(m+1)}, v^{(i-1)} \rangle}_{=0\ \text{by a)}}
- \underbrace{\langle r^{(m+1)}, v^{(i+1)} \rangle}_{=0\ \text{by a)}}
+ s_i\underbrace{\langle r^{(m+1)}, v^{(i)} \rangle}_{=0\ \text{by a)}}\Bigr)
+ s_m\underbrace{\langle v^{(m)}, Av^{(i)} \rangle}_{=0\ \text{for } i<m}
= 0.
\]
Here we have used the result already proven in a) as well as the induction hypothesis for c). Note that the third and the fifth terms above are the reason why we now have to discuss the case $i = m$ separately. We can proceed as above, but in the end are left with
\[
\langle v^{(m+1)}, Av^{(m)} \rangle = -\frac{1}{t_m}\langle r^{(m+1)}, v^{(m+1)} \rangle + s_m\langle v^{(m)}, Av^{(m)} \rangle.
\]
Now, replacing the stepsizes $t_m$ and $s_m$ using lines (7) and (11), respectively, we get
\[
\begin{aligned}
\langle v^{(m+1)}, Av^{(m)} \rangle
&= -\frac{\|v^{(m)}\|_A^2}{\|r^{(m)}\|_2^2}\langle r^{(m+1)}, v^{(m+1)} \rangle + \frac{\|r^{(m+1)}\|_2^2}{\|r^{(m)}\|_2^2}\langle v^{(m)}, Av^{(m)} \rangle \\
&= -\frac{\|v^{(m)}\|_A^2}{\|r^{(m)}\|_2^2}\langle r^{(m+1)}, r^{(m+1)} \rangle + \frac{\|r^{(m+1)}\|_2^2}{\|r^{(m)}\|_2^2}\|v^{(m)}\|_A^2 = 0,
\end{aligned}
\]

where, at the end, we have made use of statement b) proved above as well as the definition of the $A$-norm of a vector.

d) We need to show $r^{(m+1)} = b - Ax^{(m+1)}$. From line (8) of the algorithm we have
\[
b - Ax^{(m+1)} = b - A\bigl(x^{(m)} + t_m v^{(m)}\bigr) = b - Ax^{(m)} - t_m Av^{(m)} = r^{(m)} - t_m Av^{(m)}
\]
by virtue of the induction hypothesis for d). Using line (9) of the algorithm we can replace $t_m Av^{(m)}$ by a difference of residuals and have
\[
b - Ax^{(m+1)} = r^{(m)} - \bigl(r^{(m)} - r^{(m+1)}\bigr) = r^{(m+1)}.
\]
e) We show $\langle r^{(m+1)}, r^{(i)} \rangle = 0$ for $0 \le i \le m$. From line (12) of the algorithm (with $s_{-1} = 0$, $v^{(-1)} = 0$) we get
\[
\langle r^{(m+1)}, r^{(i)} \rangle = \langle r^{(m+1)}, v^{(i)} - s_{i-1}v^{(i-1)} \rangle
= \underbrace{\langle r^{(m+1)}, v^{(i)} \rangle}_{=0\ \text{by a)}} - s_{i-1}\underbrace{\langle r^{(m+1)}, v^{(i-1)} \rangle}_{=0\ \text{by a)}} = 0.
\]
f) We need to show $r^{(m+1)} \ne 0$. Since $A$ is positive definite, the quadratic form for $v^{(m+1)} \ne 0$ satisfies $\langle v^{(m+1)}, Av^{(m+1)} \rangle > 0$. On the other hand, using line (12) of the algorithm,
\[
\langle v^{(m+1)}, Av^{(m+1)} \rangle = \langle r^{(m+1)} + s_m v^{(m)}, Av^{(m+1)} \rangle
= \langle r^{(m+1)}, Av^{(m+1)} \rangle + s_m\underbrace{\langle v^{(m)}, Av^{(m+1)} \rangle}_{=0\ \text{by c)}}.
\]
Thus, $r^{(m+1)}$ cannot be zero (since $A$ is positive definite and $v^{(m+1)} \ne 0$).

Remark: The algorithms of this section were designed to minimize the quadratic form $q(x) = \langle x, Ax \rangle - 2\langle x, b \rangle$. We will now show they also minimize $\|e^{(k)}\|_A$. Let $e^{(k)} = x - x^{(k)}$, where $x$ is the true solution of the linear system $Ax = b$. Then
\[
\|e^{(k)}\|_A^2 = \langle e^{(k)}, Ae^{(k)} \rangle = \langle x - x^{(k)}, A(x - x^{(k)}) \rangle
= \langle x^{(k)}, Ax^{(k)} \rangle - 2\langle x^{(k)}, \underbrace{Ax}_{=b} \rangle + \langle x, \underbrace{Ax}_{=b} \rangle
= q(x^{(k)}) + \langle x, b \rangle.
\]
Now, since $x$ and $b$ are constant, we see that minimization of $q$ is equivalent to minimization of the $A$-norm of the error.

Preconditioned CG

We mentioned earlier that the number of iterations required for the conjugate gradient algorithm to converge is proportional to $\sqrt{\kappa_2(A)}$. Thus, for poorly conditioned

matrices, convergence will be very slow. A commonly used trick is therefore to precondition the problem. The basic idea can be described as follows. We want to transform the linear system $Ax = b$ into an equivalent system $\hat{A}\hat{x} = \hat{b}$ with $\kappa(\hat{A}) < \kappa(A)$. According to the comments above this should result in faster convergence.

How do we find $\hat{A}$, $\hat{x}$, and $\hat{b}$? Consider a nonsingular $n \times n$ matrix $S$. We can rewrite
\[
Ax = b \iff S^T A x = S^T b \iff \underbrace{S^T A S}_{=\hat{A}}\,\underbrace{S^{-1}x}_{=\hat{x}} = \underbrace{S^T b}_{=\hat{b}}.
\]
Our choice of $S$ should be such that
\[
SS^T = Q^{-1} \tag{29}
\]
with $Q$ a symmetric positive definite matrix called splitting matrix or preconditioner. This will preserve the symmetry and positive definiteness of the system. Before discussing any specific preconditioners we consider how this framework affects the CG algorithm. Formally, the conjugate gradient algorithm for the problem $\hat{A}\hat{x} = \hat{b}$ is

(1) Input $\hat{A}$, $\hat{b}$, $\hat{x}^{(0)}$, $M$, $\epsilon$
(2) $\hat{r}^{(0)} = \hat{b} - \hat{A}\hat{x}^{(0)}$
(3) $\hat{v}^{(0)} = \hat{r}^{(0)}$
(4) Output $0$, $\hat{x}^{(0)}$, $\hat{r}^{(0)}$
(5) for $k = 0$ to $M-1$ do
(6)     if $\hat{v}^{(k)} = 0$ then stop
(7)     $\hat{t}_k = \dfrac{\|\hat{r}^{(k)}\|_2^2}{\|\hat{v}^{(k)}\|_{\hat{A}}^2}$
(8)     $\hat{x}^{(k+1)} = \hat{x}^{(k)} + \hat{t}_k \hat{v}^{(k)}$
(9)     $\hat{r}^{(k+1)} = \hat{r}^{(k)} - \hat{t}_k \hat{A}\hat{v}^{(k)}$
(10)    if $\|\hat{r}^{(k+1)}\|_2 < \epsilon$ then stop
(11)    $\hat{s}_k = \dfrac{\|\hat{r}^{(k+1)}\|_2^2}{\|\hat{r}^{(k)}\|_2^2}$
(12)    $\hat{v}^{(k+1)} = \hat{r}^{(k+1)} + \hat{s}_k \hat{v}^{(k)}$
(13)    Output $k+1$, $\hat{x}^{(k+1)}$, $\hat{r}^{(k+1)}$
(14) end
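As a small check of the transformation (our own sketch, not from the notes): one concrete choice of $S$ with $SS^T = Q^{-1}$ is $S = C^{-T}$, where $Q = CC^T$ is a Cholesky factorization. The sketch below forms the hatted system for a Jacobi preconditioner, solves it (a direct solve stands in for running CG on it), and transforms back.

import numpy as np

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
Q = np.diag(np.diag(A))                    # Jacobi preconditioner as an example
C = np.linalg.cholesky(Q)                  # Q = C C^T, C lower triangular
S = np.linalg.inv(C).T                     # S = C^{-T}, so S S^T = Q^{-1}, as in (29)
A_hat = S.T @ A @ S                        # transformed (hatted) system matrix
b_hat = S.T @ b
x_hat = np.linalg.solve(A_hat, b_hat)      # stand-in for running CG on the hatted system
x = S @ x_hat                              # transform back: x = S x_hat
print(np.linalg.norm(A @ x - b))           # ~1e-15: the two systems are equivalent
print(np.linalg.cond(A), np.linalg.cond(A_hat))   # better conditioned here: about 3.7 vs 3.0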

Remark: The algorithm is made more efficient if the preconditioning operations are included directly in the algorithm. We therefore make the following substitutions:
\[
\hat{x}^{(k)} = S^{-1}x^{(k)}, \qquad
\hat{r}^{(k)} = \hat{b} - \hat{A}\hat{x}^{(k)} = S^T b - \bigl(S^T A S\bigr)\bigl(S^{-1}x^{(k)}\bigr) = S^T b - S^T A x^{(k)} = S^T r^{(k)},
\]
and also define
\[
\hat{v}^{(k)} = S^{-1}v^{(k)}, \qquad \tilde{r}^{(k)} = Q^{-1}r^{(k)}.
\]
Now we can consider how this transforms the algorithm above. Line (2) becomes
\[
\hat{r}^{(0)} = \hat{b} - \hat{A}\hat{x}^{(0)}
\iff S^T r^{(0)} = S^T b - S^T A S S^{-1} x^{(0)}
\iff r^{(0)} = b - Ax^{(0)},
\]
where we have multiplied by $S^{-T}$ in the last step. For line (3) we have
\[
\hat{v}^{(0)} = \hat{r}^{(0)}
\iff S^{-1}v^{(0)} = S^T r^{(0)}
\iff v^{(0)} = SS^T r^{(0)} = Q^{-1}r^{(0)} = \tilde{r}^{(0)},
\]
where we have multiplied by $S$ and used the definition (29) of the preconditioner $Q$. Line (7) transforms as follows:
\[
\begin{aligned}
\hat{t}_k &= \|\hat{r}^{(k)}\|_2^2 \big/ \|\hat{v}^{(k)}\|_{\hat{A}}^2 \\
&= \|S^T r^{(k)}\|_2^2 \big/ \|S^{-1}v^{(k)}\|_{S^T A S}^2 \\
&= \bigl\langle S^T r^{(k)}, S^T r^{(k)} \bigr\rangle \big/ \bigl\langle S^{-1}v^{(k)}, S^T A S S^{-1}v^{(k)} \bigr\rangle \\
&= \bigl\langle r^{(k)}, \underbrace{SS^T}_{=Q^{-1}} r^{(k)} \bigr\rangle \big/ \bigl\langle v^{(k)}, Av^{(k)} \bigr\rangle \\
&= \bigl\langle r^{(k)}, \tilde{r}^{(k)} \bigr\rangle \big/ \|v^{(k)}\|_A^2.
\end{aligned}
\]
Here we have made use of the easily established fact that for any $n \times n$ matrix $B$,
\[
\langle x, By \rangle = \langle B^T x, y \rangle.
\]
Line (8) becomes
\[
\hat{x}^{(k+1)} = \hat{x}^{(k)} + \hat{t}_k\hat{v}^{(k)}
\iff S^{-1}x^{(k+1)} = S^{-1}x^{(k)} + \hat{t}_k S^{-1}v^{(k)}
\iff x^{(k+1)} = x^{(k)} + \hat{t}_k v^{(k)},
\]
where we have multiplied by $S$ in the last step. For line (9) we have
\[
\hat{r}^{(k+1)} = \hat{r}^{(k)} - \hat{t}_k\hat{A}\hat{v}^{(k)}
\iff S^T r^{(k+1)} = S^T r^{(k)} - \hat{t}_k\bigl(S^T A S\bigr)S^{-1}v^{(k)}
\iff r^{(k+1)} = r^{(k)} - \hat{t}_k Av^{(k)},
\]
where we have multiplied by $S^{-T}$. Line (11) transforms as follows:

\[
\hat{s}_k = \frac{\|\hat{r}^{(k+1)}\|_2^2}{\|\hat{r}^{(k)}\|_2^2}
= \frac{\|S^T r^{(k+1)}\|_2^2}{\|S^T r^{(k)}\|_2^2}
= \frac{\langle r^{(k+1)}, \tilde{r}^{(k+1)} \rangle}{\langle r^{(k)}, \tilde{r}^{(k)} \rangle}.
\]
The last step here requires some of the same manipulations as those used for line (7) above. Finally, for line (12) we have
\[
\hat{v}^{(k+1)} = \hat{r}^{(k+1)} + \hat{s}_k\hat{v}^{(k)}
\iff S^{-1}v^{(k+1)} = S^T r^{(k+1)} + \hat{s}_k S^{-1}v^{(k)}
\iff v^{(k+1)} = SS^T r^{(k+1)} + \hat{s}_k v^{(k)} = Q^{-1}r^{(k+1)} + \hat{s}_k v^{(k)} = \tilde{r}^{(k+1)} + \hat{s}_k v^{(k)},
\]
where we have multiplied by $S$ and used the definition of $Q$ in the penultimate step.

Therefore, the optimized algorithm for the preconditioned conjugate gradient method is

Algorithm (PCG)

Input $A$, $b$, $x^{(0)}$, $Q$, $M$, $\epsilon$
$r^{(0)} = b - Ax^{(0)}$
Solve $Q\tilde{r}^{(0)} = r^{(0)}$ for $\tilde{r}^{(0)}$
$v^{(0)} = \tilde{r}^{(0)}$
Output $0$, $x^{(0)}$, $r^{(0)}$
for $k = 0$ to $M-1$ do
    if $v^{(k)} = 0$ then stop
    $\hat{t}_k = \dfrac{\langle r^{(k)}, \tilde{r}^{(k)} \rangle}{\|v^{(k)}\|_A^2}$
    $x^{(k+1)} = x^{(k)} + \hat{t}_k v^{(k)}$
    $r^{(k+1)} = r^{(k)} - \hat{t}_k Av^{(k)}$
    Solve $Q\tilde{r}^{(k+1)} = r^{(k+1)}$ for $\tilde{r}^{(k+1)}$
    if $\|r^{(k+1)}\|_2^2 < \epsilon$ then stop
    $\hat{s}_k = \dfrac{\langle r^{(k+1)}, \tilde{r}^{(k+1)} \rangle}{\langle r^{(k)}, \tilde{r}^{(k)} \rangle}$
    $v^{(k+1)} = \tilde{r}^{(k+1)} + \hat{s}_k v^{(k)}$
    Output $k+1$, $x^{(k+1)}$, $r^{(k+1)}$
end
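A NumPy sketch of the PCG algorithm above (our own illustration; the general solve of $Q\tilde{r} = r$ would of course be replaced by something that exploits the structure of $Q$):

import numpy as np

def preconditioned_cg(A, b, x0, Q, M=None, eps=1e-10):
    """Preconditioned CG: like CG, except that Q r_tilde = r is solved once
    per iteration and the inner products use <r, r_tilde>."""
    x = x0.astype(float).copy()
    r = b - A @ x
    rt = np.linalg.solve(Q, r)              # solve Q r_tilde = r
    v = rt.copy()
    rho = r @ rt                            # <r^(k), r_tilde^(k)>
    if M is None:
        M = len(b)
    for k in range(M):
        if not np.any(v):
            break
        Av = A @ v
        t = rho / (v @ Av)                  # t_k = <r, r_tilde> / ||v||_A^2
        x = x + t * v
        r = r - t * Av
        if r @ r < eps:                     # stopping test on ||r||_2^2
            return x, k + 1
        rt = np.linalg.solve(Q, r)          # solve Q r_tilde = r for the new residual
        rho_new = r @ rt
        v = rt + (rho_new / rho) * v        # s_k = rho_new / rho
        rho = rho_new
    return x, M

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
Q = np.diag(np.diag(A))                     # Jacobi preconditioning: Q = diag(A)
print(preconditioned_cg(A, b, np.zeros(3), Q))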

Remark: This algorithm requires the additional work needed to solve the linear system $Q\tilde{r} = r$ once per iteration. Therefore we will want to choose $Q$ so that this can be done easily and efficiently. The two extreme cases $Q = I$ and $Q = A$ are of no interest: $Q = I$ gives us the ordinary CG algorithm, whereas $Q = A$ (with $S = A^{-1/2}$) leads to the trivial preconditioned system $\hat{A}\hat{x} = \hat{b} \iff \hat{x} = \hat{b}$, since (using the symmetry of $A$)
\[
\hat{A} = S^T A S = A^{-T/2} A^{T/2} A^{1/2} A^{-1/2} = I.
\]
This may seem useful at first, but to get the solution $x$ we need
\[
x = S\hat{x} = A^{-1/2}\hat{x} = A^{-1/2}\hat{b} = A^{-1/2}S^T b = A^{-1/2}A^{-T/2}b = A^{-1}b,
\]
which is just as complicated as the original problem. Therefore, $Q$ should be chosen somewhere in between. Moreover, we want:

1. $Q$ should be symmetric and positive definite.

2. $Q$ should be such that $Q\tilde{r} = r$ can be solved efficiently.

3. $Q$ should approximate $A$ (i.e., $Q^{-1}$ should approximate $A^{-1}$) in the sense that $\|I - Q^{-1}A\| < 1$.

Possible choices for Q

If we use the decomposition $A = L + D + L^T$ of the symmetric positive definite matrix $A$, then

$Q = D$ leads to Jacobi preconditioning,
$Q = L + D$ leads to Gauss-Seidel preconditioning,
$Q = \frac{1}{\omega}(D + \omega L)$ leads to SOR preconditioning.

Another popular preconditioner is $Q = HH^T$, where $H$ is close to the Cholesky factor of $A$. This method is referred to as incomplete Cholesky factorization (see the book by Golub and Van Loan for more details).

Remark: The Matlab script PCGDemo.m illustrates the convergence behavior of the preconditioned conjugate gradient algorithm. The matrix $A$ here is a symmetric positive definite matrix with all zeros except $a_{ii} = i$ on the diagonal, $a_{ij} = 1$ on the sub- and superdiagonal, and $a_{ij} = 1$ on the 100th sub- and superdiagonals, i.e., for $|i - j| = 100$. The right-hand side vector is $b = [1, \ldots, 1]^T$. We observe that the basic CG algorithm converges very slowly, whereas the Jacobi-preconditioned method converges much faster.
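The splitting-based preconditioners can be assembled directly from $A = L + D + L^T$. The sketch below (our own, with an assumed test matrix and an arbitrary example value of $\omega$) builds the three choices listed above and prints $\|I - Q^{-1}A\|_2$ for each (cf. requirement 3):

import numpy as np

def classical_preconditioners(A, omega=1.2):
    """Splitting-based preconditioners built from A = L + D + L^T, where L is
    the strictly lower triangular part of A and D = diag(A)."""
    D = np.diag(np.diag(A))
    L = np.tril(A, k=-1)
    return {
        "jacobi": D,                        # Q = D
        "gauss_seidel": L + D,              # Q = L + D
        "sor": (D + omega * L) / omega,     # Q = (1/omega)(D + omega L)
    }

A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
for name, Q in classical_preconditioners(A).items():
    print(name, np.linalg.norm(np.eye(3) - np.linalg.solve(Q, A), 2))
    # prints ||I - Q^{-1}A||_2 for each choice of Q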

An Overview of Iterative Solvers

We end with a decision tree (taken from the book by Demmel) containing rough guidelines for the use of most of the presently available iterative solvers. Branches to the left correspond to "No", branches to the right to "Yes". All methods mentioned belong to the family of Krylov subspace iterations.

[Decision tree omitted. Its decision nodes ask whether $A$ is symmetric, whether $A^T$ is available, whether $A$ is definite, whether storage is expensive, whether $A$ is well-conditioned, and whether $\lambda_{\min}$, $\lambda_{\max}$ are known; the recommended methods at the leaves are GMRES; CGS, Bi-CGStab, GMRES(k); QMR; CGN; MINRES; CG; and CG with Chebyshev acceleration.]

The abbreviations used are:

GMRES: Generalized Minimum RESidual (1986),
CGS: Conjugate Gradient Squared (1989),
Bi-CGStab: Stabilized Bi-Conjugate Gradient (1992),
QMR: Quasi-Minimal Residual (1991),
CGN: Conjugate Gradient applied to the Normal equations,
MINRES: MINimum RESidual.

Remark: A lot more details on iterative solvers can be found in the books by Demmel, Greenbaum, Golub and van Loan, and Trefethen and Bau.


More information

Some definitions. Math 1080: Numerical Linear Algebra Chapter 5, Solving Ax = b by Optimization. A-inner product. Important facts

Some definitions. Math 1080: Numerical Linear Algebra Chapter 5, Solving Ax = b by Optimization. A-inner product. Important facts Some definitions Math 1080: Numerical Linear Algebra Chapter 5, Solving Ax = b by Optimization M. M. Sussman sussmanm@math.pitt.edu Office Hours: MW 1:45PM-2:45PM, Thack 622 A matrix A is SPD (Symmetric

More information

Iterative Solution methods

Iterative Solution methods p. 1/28 TDB NLA Parallel Algorithms for Scientific Computing Iterative Solution methods p. 2/28 TDB NLA Parallel Algorithms for Scientific Computing Basic Iterative Solution methods The ideas to use iterative

More information

Preface to the Second Edition. Preface to the First Edition

Preface to the Second Edition. Preface to the First Edition n page v Preface to the Second Edition Preface to the First Edition xiii xvii 1 Background in Linear Algebra 1 1.1 Matrices................................. 1 1.2 Square Matrices and Eigenvalues....................

More information

Lecture notes: Applied linear algebra Part 1. Version 2

Lecture notes: Applied linear algebra Part 1. Version 2 Lecture notes: Applied linear algebra Part 1. Version 2 Michael Karow Berlin University of Technology karow@math.tu-berlin.de October 2, 2008 1 Notation, basic notions and facts 1.1 Subspaces, range and

More information

Math/Phys/Engr 428, Math 529/Phys 528 Numerical Methods - Summer Homework 3 Due: Tuesday, July 3, 2018

Math/Phys/Engr 428, Math 529/Phys 528 Numerical Methods - Summer Homework 3 Due: Tuesday, July 3, 2018 Math/Phys/Engr 428, Math 529/Phys 528 Numerical Methods - Summer 28. (Vector and Matrix Norms) Homework 3 Due: Tuesday, July 3, 28 Show that the l vector norm satisfies the three properties (a) x for x

More information

Computational Linear Algebra

Computational Linear Algebra Computational Linear Algebra PD Dr. rer. nat. habil. Ralf Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2017/18 Part 3: Iterative Methods PD

More information

Applied Mathematics 205. Unit II: Numerical Linear Algebra. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit II: Numerical Linear Algebra. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit II: Numerical Linear Algebra Lecturer: Dr. David Knezevic Unit II: Numerical Linear Algebra Chapter II.3: QR Factorization, SVD 2 / 66 QR Factorization 3 / 66 QR Factorization

More information

Scientific Computing: An Introductory Survey

Scientific Computing: An Introductory Survey Scientific Computing: An Introductory Survey Chapter 11 Partial Differential Equations Prof. Michael T. Heath Department of Computer Science University of Illinois at Urbana-Champaign Copyright c 2002.

More information

Preconditioning Techniques Analysis for CG Method

Preconditioning Techniques Analysis for CG Method Preconditioning Techniques Analysis for CG Method Huaguang Song Department of Computer Science University of California, Davis hso@ucdavis.edu Abstract Matrix computation issue for solve linear system

More information

Introduction to Iterative Solvers of Linear Systems

Introduction to Iterative Solvers of Linear Systems Introduction to Iterative Solvers of Linear Systems SFB Training Event January 2012 Prof. Dr. Andreas Frommer Typeset by Lukas Krämer, Simon-Wolfgang Mages and Rudolf Rödl 1 Classes of Matrices and their

More information

Conjugate Gradient Tutorial

Conjugate Gradient Tutorial Conjugate Gradient Tutorial Prof. Chung-Kuan Cheng Computer Science and Engineering Department University of California, San Diego ckcheng@ucsd.edu December 1, 2015 Prof. Chung-Kuan Cheng (UC San Diego)

More information

Iterative Methods for Sparse Linear Systems

Iterative Methods for Sparse Linear Systems Iterative Methods for Sparse Linear Systems Luca Bergamaschi e-mail: berga@dmsa.unipd.it - http://www.dmsa.unipd.it/ berga Department of Mathematical Methods and Models for Scientific Applications University

More information

Computational Economics and Finance

Computational Economics and Finance Computational Economics and Finance Part II: Linear Equations Spring 2016 Outline Back Substitution, LU and other decomposi- Direct methods: tions Error analysis and condition numbers Iterative methods:

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 24: Preconditioning and Multigrid Solver Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 5 Preconditioning Motivation:

More information

FEM and Sparse Linear System Solving

FEM and Sparse Linear System Solving FEM & sparse system solving, Lecture 7, Nov 3, 2017 1/46 Lecture 7, Nov 3, 2015: Introduction to Iterative Solvers: Stationary Methods http://people.inf.ethz.ch/arbenz/fem16 Peter Arbenz Computer Science

More information

Notes for CS542G (Iterative Solvers for Linear Systems)

Notes for CS542G (Iterative Solvers for Linear Systems) Notes for CS542G (Iterative Solvers for Linear Systems) Robert Bridson November 20, 2007 1 The Basics We re now looking at efficient ways to solve the linear system of equations Ax = b where in this course,

More information

Math 411 Preliminaries

Math 411 Preliminaries Math 411 Preliminaries Provide a list of preliminary vocabulary and concepts Preliminary Basic Netwon s method, Taylor series expansion (for single and multiple variables), Eigenvalue, Eigenvector, Vector

More information

Lecture 9: Krylov Subspace Methods. 2 Derivation of the Conjugate Gradient Algorithm

Lecture 9: Krylov Subspace Methods. 2 Derivation of the Conjugate Gradient Algorithm CS 622 Data-Sparse Matrix Computations September 19, 217 Lecture 9: Krylov Subspace Methods Lecturer: Anil Damle Scribes: David Eriksson, Marc Aurele Gilles, Ariah Klages-Mundt, Sophia Novitzky 1 Introduction

More information

Math 1080: Numerical Linear Algebra Chapter 4, Iterative Methods

Math 1080: Numerical Linear Algebra Chapter 4, Iterative Methods Math 1080: Numerical Linear Algebra Chapter 4, Iterative Methods M. M. Sussman sussmanm@math.pitt.edu Office Hours: MW 1:45PM-2:45PM, Thack 622 March 2015 1 / 70 Topics Introduction to Iterative Methods

More information

The Solution of Linear Systems AX = B

The Solution of Linear Systems AX = B Chapter 2 The Solution of Linear Systems AX = B 21 Upper-triangular Linear Systems We will now develop the back-substitution algorithm, which is useful for solving a linear system of equations that has

More information

Lecture 9: Numerical Linear Algebra Primer (February 11st)

Lecture 9: Numerical Linear Algebra Primer (February 11st) 10-725/36-725: Convex Optimization Spring 2015 Lecture 9: Numerical Linear Algebra Primer (February 11st) Lecturer: Ryan Tibshirani Scribes: Avinash Siravuru, Guofan Wu, Maosheng Liu Note: LaTeX template

More information

Notes on PCG for Sparse Linear Systems

Notes on PCG for Sparse Linear Systems Notes on PCG for Sparse Linear Systems Luca Bergamaschi Department of Civil Environmental and Architectural Engineering University of Padova e-mail luca.bergamaschi@unipd.it webpage www.dmsa.unipd.it/

More information

APPLIED NUMERICAL LINEAR ALGEBRA

APPLIED NUMERICAL LINEAR ALGEBRA APPLIED NUMERICAL LINEAR ALGEBRA James W. Demmel University of California Berkeley, California Society for Industrial and Applied Mathematics Philadelphia Contents Preface 1 Introduction 1 1.1 Basic Notation

More information

Notes on Eigenvalues, Singular Values and QR

Notes on Eigenvalues, Singular Values and QR Notes on Eigenvalues, Singular Values and QR Michael Overton, Numerical Computing, Spring 2017 March 30, 2017 1 Eigenvalues Everyone who has studied linear algebra knows the definition: given a square

More information

Introduction. Math 1080: Numerical Linear Algebra Chapter 4, Iterative Methods. Example: First Order Richardson. Strategy

Introduction. Math 1080: Numerical Linear Algebra Chapter 4, Iterative Methods. Example: First Order Richardson. Strategy Introduction Math 1080: Numerical Linear Algebra Chapter 4, Iterative Methods M. M. Sussman sussmanm@math.pitt.edu Office Hours: MW 1:45PM-2:45PM, Thack 622 Solve system Ax = b by repeatedly computing

More information