CHAPTER 5
Basic Iterative Methods

Solve Ax = f, where A is large, sparse, and nonsingular. Let A be split as A = M - N, in which M is nonsingular and solving systems of the form Mz = r is much easier than solving Ax = f.

Iteration:

    x_0 := arbitrary
    x_{k+1} = M^{-1} N x_k + M^{-1} f = H x_k + g

Convergence: The exact solution satisfies x = Hx + g. Hence, if δx_k = x_k - x, we have

    δx_{k+1} = H δx_k,

i.e.

    δx_1 = H δx_0,  δx_2 = H δx_1 = H^2 δx_0,  ...,  δx_k = H^k δx_0,

so that

    ||δx_k|| = ||H^k δx_0|| ≤ ||H^k|| ||δx_0|| ≤ ||H||^k ||δx_0||.

Thus, a sufficient condition for convergence (lim_{k→∞} H^k = 0) is ||H|| < 1.

Necessary condition: Let x_0 = 0; then

    x_{k+1} = (I + H + H^2 + ... + H^k) g.

The limit of Σ_{j=0}^{k} H^j as k → ∞ exists iff ρ(H) < 1. In this case x_k → (I - H)^{-1} g. Going back to ||δx_k|| ≤ ||H^k|| ||δx_0||: since for any ε > 0 there is a norm for which ρ(H) ≤ ||H|| ≤ ρ(H) + ε, we have

    ||δx_k|| ≤ [ρ(H) + ε]^k ||δx_0||.
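The gap between the sufficient condition ||H|| < 1 and the sharp condition ρ(H) < 1 is easy to see numerically. In the sketch below, the matrix H is an illustrative choice of mine (not from the notes): it is non-normal with ρ(H) = 1/2 but ||H||_2 > 1, so powers of H first grow and then decay geometrically.

```python
import numpy as np

# rho(H) < 1 guarantees H^k -> 0 even when ||H||_2 > 1 for a non-normal H.
H = np.array([[0.5, 10.0],
              [0.0,  0.5]])              # rho(H) = 0.5, but ||H||_2 > 10

rho = max(abs(np.linalg.eigvals(H)))     # spectral radius = 0.5
norms = [np.linalg.norm(np.linalg.matrix_power(H, k), 2)
         for k in range(1, 60)]
# The norms grow at first (transient), then decay geometrically like rho^k.
```

This is why ρ(H) < 1, not ||H|| < 1 in any particular norm, is the right convergence criterion.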
If δx_0 is an eigenvector of H corresponding to the eigenvalue of maximum modulus, i.e.,

    H δx_0 = µ δx_0,  ρ(H) = |µ|,

then δx_k = H^k δx_0 = µ^k δx_0 and ||δx_k|| = ρ^k(H) ||δx_0||. Now, to reduce the error by a factor of 10^{-m}, m > 0, we need

    ρ^k(H) ≤ 10^{-m},  or  10^m ≤ [1/ρ(H)]^k,

i.e.,

    m ≤ k log_10 [1/ρ(H)] = k R,

where R is called the asymptotic rate of convergence of the iterative scheme, and

    k ≥ m/R = m / (-log_10 ρ(H)) = # of iterations needed to reduce the error by a factor of 10^{-m}.

Example: Let ρ(H) = 0.9 and m = 6; then k ≥ 6 / log_10(1/0.9), i.e., about 130 iterations. However, if ρ(H) = 0.2, then k ≥ 6 / log_10(1/0.2), i.e., about 9 iterations.

Algorithm: x_{k+1} = M^{-1} N x_k + M^{-1} f; but A = M - N, or N = M - A, so

    x_{k+1} = M^{-1}(M - A) x_k + M^{-1} f = x_k + M^{-1}(f - A x_k) = x_k + M^{-1} r_k.

Summary:

    x := arbitrary
    r := f - Ax
    while ||r|| ≥ tolerance
        solve Md = r
        x := x + d
        r := f - Ax
    end
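The summary above translates directly into code. A minimal sketch (function and variable names are mine; `solve_M` stands for whatever cheap solver the splitting provides):

```python
import numpy as np

def splitting_iteration(A, f, solve_M, x0, tol=1e-10, max_iter=500):
    """Generic splitting iteration x_{k+1} = x_k + M^{-1}(f - A x_k).

    solve_M(r) must return the solution d of M d = r.
    """
    x = x0.copy()
    r = f - A @ x
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:
            break
        d = solve_M(r)          # solve M d = r
        x = x + d               # x := x + d
        r = f - A @ x           # fresh residual
    return x

# Jacobi splitting (M = diag(A)) on a small strictly diagonally dominant system
A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
f = np.array([1.0, 1.0, 1.0])
x = splitting_iteration(A, f, lambda r: r / np.diag(A), np.zeros(3))
```

The loop mirrors the pseudocode: it stops when the residual norm drops below the tolerance.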
Example: Solve Ax = f, where

    A = |  4  -1   0 |
        | -1   4  -1 |
        |  0  -1   4 |

Let A = M - N with M = 4I, and take x_0^T = (0, 0, 0). Then

    N = M - A = | 0  1  0 |,   H = M^{-1} N = (1/4) | 0  1  0 |,
                | 1  0  1 |                         | 1  0  1 |
                | 0  1  0 |                         | 0  1  0 |

and the iteration is x_{k+1} = H x_k + g with g = M^{-1} f = f/4. The eigenvalues of H are

    λ_k(H) = (1/2) cos(kπ/4),  k = 1, 2, 3,

i.e., λ(H) := {√2/4, 0, -√2/4}, so

    ρ(H) = √2/4 ≈ 0.3536,  and  R = -log_10(0.3536) ≈ 0.45.

Normally, the stopping criterion is either ||r_k|| < tol_1, or ||r_k|| / ||r_0|| ≤ tol_2, where tol_1 and tol_2 are tolerances to be determined by the user. A stopping criterion based on the error in the solution is given by the following theorem.

Theorem: Let A = M - N, with H = M^{-1} N, and ||H|| < 1. Then

    ||x - x_k|| ≤ (||H|| / (1 - ||H||)) ||x_k - x_{k-1}||,  k = 1, 2, 3, ...
Proof: From

    x_{k+1} = H x_k + g,  g = M^{-1} f,
    x_k = H x_{k-1} + g,

subtracting, we get

    x_{k+1} - x_k = H (x_k - x_{k-1}),

and by recursion, we have

    x_{m+k+1} - x_{m+k} = H^{m+1} (x_k - x_{k-1}).

Since

    x_{k+p} - x_k = (x_{k+p} - x_{k+p-1}) + (x_{k+p-1} - x_{k+p-2}) + ... + (x_{k+1} - x_k)
                  = Σ_{m=0}^{p-1} (x_{k+m+1} - x_{k+m}),

we have

    ||x_{k+p} - x_k|| ≤ Σ_{m=0}^{p-1} ||H||^{m+1} ||x_k - x_{k-1}||.

But

    Σ_{m=0}^{p-1} ||H||^{m+1} = ||H|| [1 + ||H|| + ||H||^2 + ... + ||H||^{p-1}],

and since

    (1 - ||H||)(1 + ||H|| + ||H||^2 + ... + ||H||^{p-1}) = 1 - ||H||^p,

we get

    ||x_{k+p} - x_k|| ≤ ||H|| ((1 - ||H||^p) / (1 - ||H||)) ||x_k - x_{k-1}||.

Thus, since ||H|| < 1, letting p → ∞ (so that x_{k+p} → x) gives

    ||x - x_k|| ≤ (||H|| / (1 - ||H||)) ||x_k - x_{k-1}||.
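Both the tridiagonal example above (ρ(H) = √2/4) and the stopping-criterion bound just proved can be checked numerically. A sketch; the right-hand side f is an arbitrary choice of mine:

```python
import numpy as np

# Jacobi splitting for A = tridiag(-1, 4, -1) with M = 4I (the example above)
A = np.array([[ 4.0, -1.0,  0.0],
              [-1.0,  4.0, -1.0],
              [ 0.0, -1.0,  4.0]])
M = 4.0 * np.eye(3)
H = np.linalg.solve(M, M - A)            # H = M^{-1} N
rho = max(abs(np.linalg.eigvals(H)))     # should equal sqrt(2)/4 ~ 0.3536
R = -np.log10(rho)                       # asymptotic rate, ~ 0.45

# Check  ||x - x_k|| <= ||H||/(1 - ||H||) ||x_k - x_{k-1}||  after 10 steps
f = np.array([1.0, 2.0, 1.0])            # arbitrary right-hand side
x_exact = np.linalg.solve(A, f)
g = np.linalg.solve(M, f)
x_prev = np.zeros(3)
x_curr = H @ x_prev + g
for _ in range(9):
    x_prev, x_curr = x_curr, H @ x_curr + g

nH = np.linalg.norm(H, np.inf)           # ||H||_inf = 1/2 here
err = np.linalg.norm(x_exact - x_curr, np.inf)
bound = nH / (1.0 - nH) * np.linalg.norm(x_curr - x_prev, np.inf)
# err <= bound must hold by the theorem
```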
Example: Let

    A = |  2  -1 |,   f = | 1 |.
        | -3   4 |        | 1 |

With A = M - N, where

    M = | 2  0 |,   N = | 0  1 |,
        | 0  4 |        | 3  0 |

and x_0 = 0, we have

    H = M^{-1} N = |  0   1/2 |,   ||H||_∞ = 3/4 < 1.
                   | 3/4   0  |

Since x = (1, 1)^T, while x_3 = (13/16, 23/32)^T and x_4 = (55/64, 55/64)^T, the theorem gives

    ||x - x_4||_∞ ≤ ((3/4) / (1 - 3/4)) ||x_4 - x_3||_∞ = 3 · (9/64) = 27/64,

while the actual error is ||x - x_4||_∞ = 9/64, or 1/3 of the upper bound above.

Acceleration of the Iterative Scheme

With A = M - N, the original scheme is

    x_{k+1} = x_k + M^{-1} r_k,  r_k = f - A x_k.

From (I - H)x = g, with H = M^{-1} N and g = M^{-1} f, the original scheme may also be expressed as

    x_{k+1} = H x_k + g = x_k + [g - (I - H) x_k] = x_k + r̂_k,

where r̂_k = M^{-1} r_k. We can accelerate this basic scheme via

    x_{k+1} = x_k + α r̂_k    (**)

in which α is a scalar. Iteration (**) may be written as

    x_{k+1} = (I - αG) x_k + α g,
where G = I - H. For convergence we must have ρ(I - αG) < 1.

Objective: Choose α so that ρ(I - αG) is minimized. Let λ(G) ⊂ [λ_min, λ_max], and let λ_k(I - αG) = µ_k; thus

    1 - α λ_max ≤ µ_k ≤ 1 - α λ_min.

Note that if λ_min < 0, then for some k, |µ_k| > 1 and the iteration diverges. Therefore, we assume that λ_min > 0, i.e., that the basic scheme is convergent. Hence 0 < λ_min ≤ λ_max. Since µ_k = 1 - α λ_k, the bounds |1 - α λ_min| and |1 - α λ_max| are straight lines as functions of α.

(Figure: |1 - α λ_max| and |1 - α λ_min| versus α; the larger of the two is minimized at α_opt, which lies between 1/λ_max and 1/λ_min.)

For α optimal, we must have

    -(1 - α λ_max) = 1 - α λ_min,

so

    α_optimal = 2 / (λ_min + λ_max)

and

    ρ(I - α_optimal G) = 1 - (2 λ_min) / (λ_min + λ_max) = (λ_max - λ_min) / (λ_min + λ_max).

Diagonally Dominant Matrices

* A (n × n) is strictly diagonally dominant if:

    |a_ii| > Σ_{j≠i} |a_ij|,  i = 1, ..., n.

* A is weakly diagonally dominant if:

    |a_ii| ≥ Σ_{j≠i} |a_ij|,  i = 1, ..., n.
* A is irreducibly diagonally dominant if:

(i) there does not exist a permutation P for which

    P A P^T = | B_11  B_12 |,
              |  0    B_22 |

where B_11 and B_22 are square matrices (i.e., A is irreducible), and

(ii) |a_ii| ≥ Σ_{j≠i} |a_ij|, with strict inequality holding for at least one row k.

Gerschgorin's Theorems

1. Let A ∈ C^{n×n}; then each eigenvalue of A lies in one of the disks in the complex plane

    |λ - a_ii| ≤ Σ_{j≠i} |a_ij|.

Proof: Consider the eigenvalue problem Ax = λx, where |x_k| = max_{1≤j≤n} |x_j|. Since (λI - A)x = 0, the k-th row gives

    (λ - a_kk) x_k = Σ_{j≠k} a_kj x_j,

so

    |λ - a_kk| = (1/|x_k|) |Σ_{j≠k} a_kj x_j| ≤ Σ_{j≠k} |a_kj| (|x_j| / |x_k|),

or

    |λ - a_kk| ≤ Σ_{j≠k} |a_kj|.
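Theorem 1 is easy to check numerically. In this sketch the matrix A is an arbitrary non-symmetric example of mine; every computed eigenvalue must land in at least one disk:

```python
import numpy as np

def gerschgorin_disks(A):
    """Return (centers, radii) of the Gerschgorin disks of A."""
    centers = np.diag(A)
    radii = np.sum(np.abs(A), axis=1) - np.abs(centers)
    return centers, radii

# An illustrative (non-symmetric) matrix; every eigenvalue must lie
# in at least one disk |z - a_ii| <= r_i.
A = np.array([[ 4.0,  1.0, -1.0],
              [ 0.5, -3.0,  2.0],
              [-1.0,  0.5, 10.0]])
centers, radii = gerschgorin_disks(A)
eigs = np.linalg.eigvals(A)
in_some_disk = [any(abs(lam - c) <= r + 1e-12 for c, r in zip(centers, radii))
                for lam in eigs]
```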
Example: For a 3×3 matrix A whose Gerschgorin disks are sketched below,

    λ(A) ≈ 0.7898 ± 3.885i, 5.405.

(Figure: the three Gerschgorin disks and the eigenvalues of A in the complex plane.)

All 3 eigenvalues are contained in the union of the 3 disks.

2. If k Gerschgorin disks are disjoint from the remaining disks, then exactly k eigenvalues lie in the union of those k disks.

Example: For a 3×3 matrix A whose first two Gerschgorin disks overlap each other while the third is disjoint from them,

    λ(A) := 0.5, 1.5, 5.0.
(Figure: the two overlapping disks contain two eigenvalues; the disjoint disk contains one eigenvalue.)

Theorem: If A is strictly diagonally dominant or irreducibly diagonally dominant, then it is nonsingular.

Proof: (a) If A is strictly diagonally dominant, then zero is not contained in the union of the G-disks, since |a_ii| > Σ_{j≠i} |a_ij|.

(b) If A is irreducibly diagonally dominant, suppose A is singular, i.e., there exists x ≠ 0 s.t. Ax = 0. Let m be a row of A for which

    |a_mm| > Σ_{j≠m} |a_mj|.

Let J be the set of indices

    J = {k : |x_k| ≥ |x_i|, i = 1, 2, ..., n, and |x_k| > |x_j| for some 1 ≤ j ≤ n}.

Clearly, J is not empty, for otherwise we would have |x_1| = |x_2| = ... = |x_n| ≠ 0,
contradicting |a_mm| > Σ_{j≠m} |a_mj|. Now, for any k ∈ J, we have

    |a_kk| ≤ Σ_{j≠k} |a_kj| (|x_j| / |x_k|).

Thus, to maintain the property of diagonal dominance, we must have a_kj = 0 whenever |x_j| / |x_k| < 1, i.e., for k ∈ J, j ∉ J. But since A is irreducible, this is a contradiction. Q.E.D.

Theorem: If A is strictly diagonally dominant, or irreducibly diagonally dominant, then the associated Jacobi and Gauss-Seidel iterations converge for any initial iterate x_0.

Proof: Let A = D - L - U, where D is diagonal and L, U are strictly lower and upper triangular, respectively.

(a) Strictly diagonally dominant case:

(i) Jacobi:

    A = M_J - N_J,  M_J = D,  N_J = L + U,
    H_J = M_J^{-1} N_J = M_J^{-1}(M_J - A) = I - M_J^{-1} A = I - D^{-1} A,

i.e., H_J is of the form

    H_J = - |     0       a_12/a_11    ...     a_1n/a_11    |
            | a_21/a_22       0        ...     a_2n/a_22    |
            |    ...                   ...        ...       |
            | a_n1/a_nn      ...   a_{n,n-1}/a_nn     0     |

Thus, from the Gerschgorin Theorems,
    ρ(H_J) ≤ max_i Σ_{j≠i} |a_ij| / |a_ii| < 1,

since the disks of H_J are centered at 0 with radii ρ_i = Σ_{j≠i} |a_ij| / |a_ii| < 1; as well, ||H_J||_∞ < 1, a sufficient condition for convergence.

(ii) Gauss-Seidel: H_GS = (D - L)^{-1} U. Consider the eigenvalue problem

    H_GS w = λw,  or  Uw = λ(D - L)w.

Also, let w be scaled such that for some m, |w_m| = 1, and |w_i| ≤ 1. Then, from the m-th row,

    -Σ_{j>m} a_mj w_j = λ [a_mm w_m + Σ_{j<m} a_mj w_j],

i.e.,

    λ = -Σ_{j>m} a_mj w_j / (a_mm w_m + Σ_{j<m} a_mj w_j),

so

    |λ| ≤ Σ_{j>m} |a_mj| / (|a_mm| - Σ_{j<m} |a_mj|) = σ_1 / (d - σ_2) = σ_1 / (σ_1 + (d - σ_1 - σ_2)),
where d = |a_mm|, σ_1 = Σ_{j>m} |a_mj|, σ_2 = Σ_{j<m} |a_mj|, and d > σ_1 + σ_2. Hence |λ| < 1 and G.S. converges.

(b) Irreducibly diagonally dominant case:

(i) Jacobi: From the Gerschgorin theorems, we can only show that ρ(M_J^{-1} N_J) ≤ 1. Suppose ρ(H_J) = 1, where H_J = M_J^{-1} N_J; then (H_J - λI) is singular for some λ with |λ| = 1, and hence M_J^{-1}(N_J - λ M_J) is singular. Since A = M_J - N_J is irreducibly diagonally dominant, for any scalar α with |α| = 1 the matrix (α M_J - N_J) is also irreducibly diagonally dominant, i.e., (α M_J - N_J) is nonsingular. Then (λ M_J - N_J) is nonsingular for |λ| = 1, a contradiction if λ is an eigenvalue of H_J. Consequently, ρ(H_J) < 1.

(ii) Gauss-Seidel: The proof that ρ(H_GS) < 1 is identical to that of Jacobi above.

Matrices with Property A

A matrix A is 2-cyclic, or has property A, if there is a permutation P such that

    P A P^T = | D_1   E  |,
              |  F   D_2 |

where D_1 and D_2 are diagonal. The tridiagonal matrix

    A = | *  *             |
        | *  *  *          |
        |    *  *  *       |
        |       .  .  .    |
        |          *  *  * |
        |             *  * |
has property A, with the permutation P given by

    P^T = [e_1, e_3, e_5, ... ; e_2, e_4, e_6, ...]
            (odd)               (even)

Let

    A = | D_1   E  |,
        |  F   D_2 |

with D_1, D_2 diagonal.

(i) Jacobi:

    H_J = | D_1   0  |^{-1} |  0  -E |  =  - |      0        D_1^{-1} E |
          |  0   D_2 |      | -F   0 |       | D_2^{-1} F        0      |

Now consider the eigenvalue problem

    H_J | x | = λ_J | x |,
        | y |       | y |

which gives

    (D_2^{-1} F)(D_1^{-1} E) y = λ_J^2 y.

For convergence we must have |λ_J| < 1.

(ii) Gauss-Seidel:

    H_GS = | D_1   0  |^{-1} | 0  -E |  =  | 0   -D_1^{-1} E            |
           |  F   D_2 |      | 0   0 |     | 0    D_2^{-1} F D_1^{-1} E |

Hence

    λ(H_GS) := {0, λ_GS},  λ_GS ∈ λ[(D_2^{-1} F)(D_1^{-1} E)].

For convergence we must have |λ_GS| < 1. Thus, λ_GS = λ_J^2, i.e., G.S. converges twice as fast as Jacobi for matrices with property A.
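The relation λ_GS = λ_J^2 can be verified numerically for a tridiagonal (hence property-A) matrix; a sketch, with A = tridiag(-1, 4, -1) as an illustrative choice of mine:

```python
import numpy as np

# For a tridiagonal (property-A) matrix, rho(H_GS) = rho(H_J)^2.
n = 6
A = (np.diag(4.0 * np.ones(n))
     - np.diag(np.ones(n - 1), -1)
     - np.diag(np.ones(n - 1), 1))

D = np.diag(np.diag(A))
L = -np.tril(A, -1)                      # A = D - L - U
U = -np.triu(A, 1)

H_J = np.linalg.solve(D, L + U)          # Jacobi iteration matrix
H_GS = np.linalg.solve(D - L, U)         # Gauss-Seidel iteration matrix

rho_J = max(abs(np.linalg.eigvals(H_J)))
rho_GS = max(abs(np.linalg.eigvals(H_GS)))
# rho_GS should equal rho_J**2
```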
Positive Definite Systems

Theorem: Let A be symmetric, with A = M - N, in which M is nonsingular, and S = M^T + N is positive definite. Then H = M^{-1} N = I - M^{-1} A is convergent (i.e., ρ(H) < 1) iff A is positive definite.

Proof (sufficiency): Let (λ, u) be an eigenpair of H, i.e., Hu = λu, or

    (I - M^{-1} A) u = λ u.

Hence

    Au = (1 - λ) Mu

(note λ ≠ 1, since A is nonsingular), and

    (1/(1 - λ)) (u* A u) = u* M u,      (1)
    (1/(1 - λ̄)) (u* A u) = u* M^T u.    (2)

From (2), since M^T = A + N^T for symmetric A, we have

    (1/(1 - λ̄)) (u* A u) = u* (A + N^T) u = u* A u + u* N^T u,

or

    (λ̄/(1 - λ̄)) (u* A u) = u* N^T u.    (3)

From (1), we get

    (1/(1 - λ)) (u* A u) = u* M u.      (4)

Adding (3) and (4), we get

    [ λ̄/(1 - λ̄) + 1/(1 - λ) ] (u* A u) = u* (M + N^T) u = u* S^T u,

and since

    λ̄/(1 - λ̄) + 1/(1 - λ) = (1 - |λ|^2) / ((1 - λ)(1 - λ̄)),

we have

    [ (1 - |λ|^2) / ((1 - λ)(1 - λ̄)) ] (u* A u) = u* S^T u.

Thus, if λ = α + iβ, with i = √(-1), then

    u* S^T u = [ (1 - |λ|^2) / γ ] (u* A u),

where γ = (1 - α)^2 + β^2 > 0. As a result, if A is s.p.d. and S is s.p.d. (note S^T = S, since A is symmetric), then 1 - |λ|^2 > 0, i.e., ρ(H) < 1.
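For the Gauss-Seidel splitting of an s.p.d. matrix (M = D - L, N = L^T), the matrix S = M^T + N in the theorem reduces to D, so the theorem guarantees convergence. A numerical sketch, on a small s.p.d. matrix of my own choosing:

```python
import numpy as np

# Gauss-Seidel splitting: M = D - L (lower triangle of A), N = M - A = L^T.
# Then S = M^T + N = D, which is s.p.d. whenever A is, so rho(H) < 1.
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 3.0, 0.5],
              [1.0, 0.5, 5.0]])           # s.p.d. (illustrative example)
assert np.all(np.linalg.eigvalsh(A) > 0)  # confirm positive definiteness

M = np.tril(A)                            # D - L
H = np.eye(3) - np.linalg.solve(M, A)     # H = I - M^{-1} A = M^{-1} N
rho = max(abs(np.linalg.eigvals(H)))      # must be < 1 by the theorem
```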
Observation 1: Let A = D - L - L^T be s.p.d. For Jacobi, M = D (with e_j^T D e_j > 0) and N = L + L^T, so

    S = M^T + N = D + L + L^T,

and Jacobi converges if S is s.p.d.

Observation 2: For Gauss-Seidel, M = D - L and N = L^T, so

    S = M^T + N = D - L^T + L^T = D,

which is s.p.d.; hence Gauss-Seidel converges for s.p.d. systems.

Symmetric Gauss-Seidel

Let A be s.p.d. and given by A = D - L - L^T, where D is diagonal and L is strictly lower triangular. The iteration for solving Ax = f is

    (D - L) x_{k+1/2} = L^T x_k + f
    (D - L^T) x_{k+1} = L x_{k+1/2} + f.

Let (redefining L and the iterates in scaled form)

    L := D^{-1/2} L D^{-1/2},  y_k = D^{1/2} x_k.

Then the symmetric G.S. iteration is given by

    (I - L) y_{k+1/2} = L^T y_k + D^{-1/2} f
    (I - L^T) y_{k+1} = L y_{k+1/2} + D^{-1/2} f,

or

    (I - L^T) y_{k+1} = L (I - L)^{-1} L^T (I - L^T)^{-1} (I - L^T) y_k + [I + L (I - L)^{-1}] D^{-1/2} f.

Let z_{k+1} = (I - L^T) y_{k+1}. Then

    z_{k+1} = L (I - L)^{-1} L^T (I - L^T)^{-1} z_k + h = K z_k + h.
Since L is strictly lower triangular (its diagonal elements are all zero), L^n = 0, so

    L (I - L)^{-1} = L (I + L + L^2 + ... + L^{n-1})
                   = (I + L + L^2 + ... + L^{n-1}) L = (I - L)^{-1} L.

Thus, the iteration matrix K is given by

    K = (I - L)^{-1} L L^T (I - L^T)^{-1} = [(I - L)^{-1} L] [(I - L)^{-1} L]^T,

i.e., K is symmetric positive semidefinite, so the eigenvalues λ(K) ≥ 0 are all real. Also,

    I - K = I - (I - L)^{-1} L L^T (I - L^T)^{-1}
          = (I - L)^{-1} [(I - L)(I - L^T) - L L^T] (I - L^T)^{-1},

i.e.,

    I - K = (I - L)^{-1} [I - L - L^T] (I - L^T)^{-1}
          = (I - L)^{-1} [D^{-1/2} A D^{-1/2}] (I - L^T)^{-1},

and since D^{-1/2} A D^{-1/2} is s.p.d., (I - K) is s.p.d., i.e., λ(K) < 1. Hence

    0 ≤ λ(K) < 1.

Thus, symmetric G.S. converges for s.p.d. systems.
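The conclusion 0 ≤ λ(K) < 1 can be confirmed numerically; a sketch on a small s.p.d. matrix (my own illustrative choice):

```python
import numpy as np

# Symmetric Gauss-Seidel iteration matrix K = (I-L)^{-1} L L^T (I-L^T)^{-1}
# for a diagonally scaled s.p.d. matrix: eigenvalues of K are real, in [0, 1).
A = np.array([[4.0, 1.0, 1.0],
              [1.0, 3.0, 0.5],
              [1.0, 0.5, 5.0]])           # s.p.d.
d = np.sqrt(np.diag(A))
As = A / np.outer(d, d)                   # D^{-1/2} A D^{-1/2} = I - L - L^T
L = -np.tril(As, -1)                      # strictly lower triangular part
n = len(A)
I = np.eye(n)
K = np.linalg.solve(I - L, L) @ L.T @ np.linalg.inv(I - L.T)

eigs = np.linalg.eigvals(K)
# eigs are real, nonnegative, and strictly below 1
```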