Lecture Note 7: Iterative Methods for Solving Linear Systems
Xiaoqun Zhang, Shanghai Jiao Tong University
Last updated: December 24, 2014
1.1 Review of linear algebra

Norms of vectors and matrices

Vector norm: a vector norm on R^n is a function \|\cdot\| from R^n into R with the following properties:
- \|x\| \ge 0 for all x \in R^n;
- \|x\| = 0 if and only if x = 0;
- \|\alpha x\| = |\alpha|\,\|x\| for all \alpha \in R and x \in R^n;
- \|x + y\| \le \|x\| + \|y\| (triangle inequality).

Examples:
- l_2: \|x\|_2 = (\sum_{i=1}^n x_i^2)^{1/2}
- l_\infty: \|x\|_\infty = \max_{1\le i\le n} |x_i|
- l_1: \|x\|_1 = \sum_{i=1}^n |x_i|
- l_p: \|x\|_p = (\sum_{i=1}^n |x_i|^p)^{1/p}, for p \ge 1

The l_2 norm is also called the Euclidean norm; it represents the usual notion of distance from the origin when x is in R, R^2, or R^3. (Figures: unit balls of the l_1, l_2, l_\infty norms in R^2 and R^3.)

Distance between vectors: the norm of the difference of two vectors, \|x - y\|. It can be used to measure the error between the true solution and an approximate one.

Convergence of a sequence: if \|x^{(k)} - x\|_\infty \to 0, then \lim_{k\to\infty} x_i^{(k)} = x_i for each i.

Important inequalities:
(Cauchy-Schwarz inequality) For each x = (x_1, x_2, \dots, x_n)^t and y = (y_1, \dots, y_n)^t in R^n,
  |x^t y| = |\sum_{i=1}^n x_i y_i| \le (\sum_{i=1}^n x_i^2)^{1/2} (\sum_{i=1}^n y_i^2)^{1/2} = \|x\|_2 \|y\|_2.
Also
  \|x\|_\infty \le \|x\|_2 \le \sqrt{n}\,\|x\|_\infty.

Matrix norm: a matrix norm on the set of all n \times n matrices is a real-valued function \|\cdot\| defined on this set, satisfying for all n \times n matrices A and B and all real numbers \alpha:
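The norm definitions and inequalities above are easy to check numerically; a minimal NumPy sketch (the vectors x and y are arbitrary examples of mine, not from the text):

```python
import numpy as np

x = np.array([1.0, -2.0, 3.0])
y = np.array([2.0, 0.5, -1.0])

# l1, l2, l-infinity norms computed directly from their definitions
l1 = np.sum(np.abs(x))          # sum of absolute values
l2 = np.sqrt(np.sum(x**2))      # Euclidean norm
linf = np.max(np.abs(x))        # largest absolute component

# the same norms via NumPy's built-in routine
assert np.isclose(l1, np.linalg.norm(x, 1))
assert np.isclose(l2, np.linalg.norm(x, 2))
assert np.isclose(linf, np.linalg.norm(x, np.inf))

# equivalence inequality: ||x||_inf <= ||x||_2 <= sqrt(n) ||x||_inf
n = x.size
assert linf <= l2 <= np.sqrt(n) * linf

# Cauchy-Schwarz: |x^t y| <= ||x||_2 ||y||_2
assert abs(x @ y) <= np.linalg.norm(x) * np.linalg.norm(y)
```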
- \|A\| \ge 0, and \|A\| = 0 if and only if A is O, the matrix with all zero entries;
- \|\alpha A\| = |\alpha|\,\|A\|;
- \|A + B\| \le \|A\| + \|B\|;
- \|AB\| \le \|A\|\,\|B\|.

Distance between two matrices: \|A - B\|.

Natural (induced) norm: if \|\cdot\| is a vector norm on R^n, then
  \|A\| = \max_{\|x\|=1} \|Ax\|
is a matrix norm, called the natural, or induced, matrix norm. We also have
  \|A\| = \max_{\|x\|=1} \|Ax\| = \max_{z\ne 0} \|A(z/\|z\|)\| = \max_{z\ne 0} \frac{\|Az\|}{\|z\|}.
This definition leads to \|Az\| \le \|A\|\,\|z\|.

Examples of matrix norms. Natural norms:
  \|A\|_\infty = \max_{\|x\|_\infty = 1} \|Ax\|_\infty = \max_{1\le i\le n} \sum_{j=1}^n |a_{ij}|.

Proof. First show \|A\|_\infty \le \max_{1\le i\le n} \sum_{j=1}^n |a_{ij}|. Take any x with 1 = \|x\|_\infty = \max_{1\le j\le n} |x_j|. Then
  \|Ax\|_\infty = \max_{1\le i\le n} |(Ax)_i| = \max_{1\le i\le n} |\sum_{j=1}^n a_{ij} x_j| \le \max_{1\le i\le n} \sum_{j=1}^n |a_{ij}| \max_{1\le j\le n} |x_j|.
But \max_{1\le j\le n} |x_j| = \|x\|_\infty = 1, so the inequality is proved.

Next show \|A\|_\infty \ge \max_{1\le i\le n} \sum_{j=1}^n |a_{ij}|. Let p be an index with \sum_{j=1}^n |a_{pj}| = \max_{1\le i\le n} \sum_{j=1}^n |a_{ij}|, and let x be the vector with x_j = 1 if a_{pj} \ge 0 and x_j = -1 otherwise. Then \|x\|_\infty = 1 and a_{pj} x_j = |a_{pj}| for all j = 1, \dots, n. So
  \|Ax\|_\infty = \max_{1\le i\le n} |\sum_{j=1}^n a_{ij} x_j| \ge |\sum_{j=1}^n a_{pj} x_j| = \sum_{j=1}^n |a_{pj}| = \max_{1\le i\le n} \sum_{j=1}^n |a_{ij}|.
This implies that \|A\|_\infty = \max_{\|x\|_\infty=1} \|Ax\|_\infty \ge \max_{1\le i\le n} \sum_{j=1}^n |a_{ij}|, which completes the proof.

Other natural norms:
  \|A\|_1 = \max_{\|x\|_1=1} \|Ax\|_1 = \max_{1\le j\le n} \sum_{i=1}^n |a_{ij}| (maximum absolute column sum);
  \|A\|_2 = \max_{\|x\|_2=1} \|Ax\|_2 = \sqrt{\rho(A^T A)},
where \rho(A^T A) denotes the maximal eigenvalue of A^T A. Matlab command: norm(A, p).

A non-natural norm: the Frobenius norm
  \|A\|_F = (\sum_{i=1}^n \sum_{j=1}^n |a_{ij}|^2)^{1/2}.

Some useful inequalities for these norms:
- \|A\|_2 \le \|A\|_F \le \sqrt{n}\,\|A\|_2;
- \|Ax\|_2 \le \|A\|_F \|x\|_2;
- \rho(A) \le \|A\| for any natural norm.

1.2 Iterative methods for linear systems

We will introduce the Jacobi, Gauss-Seidel, and SOR methods, classic iterative methods that date to the late eighteenth century. Iterative techniques are seldom used for solving linear systems of small dimension, since the time required for sufficient accuracy exceeds that required for direct techniques such as Gaussian elimination. For large systems with a high percentage of zero entries, however, these techniques are efficient in terms of both computer storage and computation.

A numerical iterative method for solving Ax = b starts with an initial approximation x^{(0)} to the solution x and generates a sequence of vectors \{x^{(k)}\}_{k=0}^\infty that converges to x. Iterative techniques involve a process that converts the system Ax = b into an equivalent system of the form x = Tx + c for some fixed matrix T and vector c, giving the fixed-point iteration
  x^{(k+1)} = T x^{(k)} + c.
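The matrix-norm identities above (max row sum, max column sum, the 2-norm as \sqrt{\rho(A^T A)}, and \rho(A) \le \|A\|) can be verified numerically; a sketch with NumPy, where the matrix A is an arbitrary example of mine:

```python
import numpy as np

A = np.array([[1.0,  2.0, -1.0],
              [0.0,  3.0,  1.0],
              [5.0, -1.0,  1.0]])

# ||A||_inf is the maximum absolute row sum
row_sum_norm = np.max(np.sum(np.abs(A), axis=1))   # rows sum to 4, 4, 7
assert np.isclose(row_sum_norm, np.linalg.norm(A, np.inf))

# ||A||_1 is the maximum absolute column sum
col_sum_norm = np.max(np.sum(np.abs(A), axis=0))   # columns sum to 6, 6, 3
assert np.isclose(col_sum_norm, np.linalg.norm(A, 1))

# ||A||_2 = sqrt(rho(A^T A))
two_norm = np.sqrt(np.max(np.linalg.eigvals(A.T @ A).real))
assert np.isclose(two_norm, np.linalg.norm(A, 2))

# rho(A) <= ||A|| for any natural norm
rho = np.max(np.abs(np.linalg.eigvals(A)))
assert rho <= row_sum_norm + 1e-12
assert rho <= col_sum_norm + 1e-12

# ||A||_2 <= ||A||_F <= sqrt(n) ||A||_2
fro = np.linalg.norm(A, 'fro')
assert two_norm <= fro + 1e-12 <= np.sqrt(A.shape[0]) * two_norm + 1e-12
```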
Jacobi iterative method

Transform the linear system: the Jacobi method is obtained by solving the i-th equation of Ax = b for x_i (provided a_{ii} \ne 0),
  x_i = \frac{1}{a_{ii}} (b_i - \sum_{j\ne i} a_{ij} x_j), for i = 1, 2, \dots, n,
and generating each x_i^{(k)} from the components of x^{(k-1)}, for k \ge 1, by
  x_i^{(k)} = \frac{1}{a_{ii}} (b_i - \sum_{j\ne i} a_{ij} x_j^{(k-1)}), for i = 1, 2, \dots, n.   (1.1)

Example: solve
  10x_1 - x_2 + 2x_3 = 6
  -x_1 + 11x_2 - x_3 + 3x_4 = 25
  2x_1 - x_2 + 10x_3 - x_4 = -11
  3x_2 - x_3 + 8x_4 = 15.
The solution is x = (1, 2, -1, 1)^t. Starting from x^{(0)} = (0, 0, 0, 0)^t,
  x^{(1)} = (0.6000, 2.2727, -1.1000, 1.8750)^t, \dots, x^{(10)} = (1.0001, 1.9998, -0.9998, 0.9998)^t,
and the relative error with respect to the l_\infty norm is less than 10^{-3}.

Matrix form. Decompose A = D - L - U, where D is the diagonal matrix whose diagonal entries are those of A, -L is the strictly lower-triangular part of A, and -U is the strictly upper-triangular part of A. Then Ax = b becomes Dx = (L + U)x + b. If D^{-1} exists, that is, a_{ii} \ne 0 for each i, then
  x^{(k)} = D^{-1}(L + U)x^{(k-1)} + D^{-1}b, for k = 1, 2, \dots
Letting T_j = D^{-1}(L + U) and c_j = D^{-1}b, this is x^{(k)} = T_j x^{(k-1)} + c_j. In practice, this form is used only for theoretical purposes, while the componentwise form (1.1) is used in computation.

In the previous example:
  T_j = \begin{pmatrix} 0 & 1/10 & -1/5 & 0 \\ 1/11 & 0 & 1/11 & -3/11 \\ -1/5 & 1/10 & 0 & 1/10 \\ 0 & -3/8 & 1/8 & 0 \end{pmatrix},
  c_j = (3/5, 25/11, -11/10, 15/8)^t.
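The componentwise Jacobi update (1.1) translates directly into code; a sketch on the 4x4 example system above (function and variable names are mine):

```python
import numpy as np

A = np.array([[10., -1.,  2.,  0.],
              [-1., 11., -1.,  3.],
              [ 2., -1., 10., -1.],
              [ 0.,  3., -1.,  8.]])
b = np.array([6., 25., -11., 15.])

def jacobi(A, b, x0, tol=1e-3, max_iter=100):
    """Componentwise Jacobi iteration: each x_i^(k) uses only x^(k-1)."""
    n = len(b)
    x = x0.astype(float).copy()
    for k in range(1, max_iter + 1):
        x_new = np.empty_like(x)
        for i in range(n):
            s = sum(A[i, j] * x[j] for j in range(n) if j != i)
            x_new[i] = (b[i] - s) / A[i, i]
        # stop when the relative l-infinity error between iterates is below tol
        if np.max(np.abs(x_new - x)) / np.max(np.abs(x_new)) < tol:
            return x_new, k
        x = x_new
    return x, max_iter

x, k = jacobi(A, b, np.zeros(4))
print(k, x)   # converges near (1, 2, -1, 1) in roughly 10 iterations
```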
Jacobi iterative algorithm: to solve Ax = b given an initial approximation x^{(0)}.
Input: the number of equations and unknowns n; the entries a_{ij} of A; the entries b_i of b; the initial approximation x^{(0)}; tolerance TOL; maximum number of iterations N.
Output: the approximate solution x, or a message that the number of iterations was exceeded.
1. Set k = 1.
2. While k \le N do:
   (a) For i = 1, \dots, n, set x_i = \frac{1}{a_{ii}} [b_i - \sum_{j\ne i} a_{ij} x_j^{(0)}].
   (b) If \|x - x^{(0)}\| < TOL, then output x and stop.
   (c) Set k = k + 1.
   (d) For i = 1, \dots, n, set x_i^{(0)} = x_i.
3. Output "maximum number of iterations exceeded"; stop.

Comments on the algorithm:
- The algorithm requires that a_{ii} \ne 0 for each i = 1, 2, \dots, n. If one of the a_{ii} entries is 0 and the system is nonsingular, a reordering of the equations can be performed so that no a_{ii} = 0. To speed convergence, the equations should be arranged so that |a_{ii}| is as large as possible.
- Another possible stopping criterion is to iterate until the relative error \|x^{(k)} - x^{(k-1)}\| / \|x^{(k)}\| is smaller than some tolerance. For this purpose, any convenient norm can be used, the usual one being the l_\infty norm.

1.2.1 Gauss-Seidel method

In the Jacobi method, only the components of x^{(k-1)} are used to compute the components x_i^{(k)} of x^{(k)}. But for i > 1, the components x_1^{(k)}, x_2^{(k)}, \dots, x_{i-1}^{(k)} of x^{(k)} have already been computed and are expected to be better approximations to the actual solutions x_1, \dots, x_{i-1} than x_1^{(k-1)}, \dots, x_{i-1}^{(k-1)}. It seems reasonable, then, to compute x_i^{(k)} using these most recently calculated values. This is the Gauss-Seidel iterative technique:
  x_i^{(k)} = \frac{1}{a_{ii}} [b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k-1)}].   (1.2)

Example (same system as before):
  10x_1 - x_2 + 2x_3 = 6
  -x_1 + 11x_2 - x_3 + 3x_4 = 25
  2x_1 - x_2 + 10x_3 - x_4 = -11
  3x_2 - x_3 + 8x_4 = 15.
The solution is x = (1, 2, -1, 1)^t. Start with x^{(0)} = (0, 0, 0, 0)^t and iterate until the relative error
  \frac{\|x^{(k)} - x^{(k-1)}\|_\infty}{\|x^{(k)}\|_\infty} \le 10^{-3}.
For the Gauss-Seidel method we write the update (1.2) for each k = 1, 2, \dots. With x^{(0)} = (0, 0, 0, 0)^t we obtain x^{(1)} = (0.6000, 2.3272, -0.9873, 0.8789)^t, and by x^{(5)} = (1.0001, 2.0000, -1.0000, 1.0000)^t the relative error is 4 \times 10^{-4}, so x^{(5)} is accepted as a reasonable approximation to the solution. Note that, in the earlier example, Jacobi's method required twice as many iterations for the same accuracy.

Matrix form: writing (1.2) as (D - L)x^{(k)} = Ux^{(k-1)} + b, with the definitions of D, L, and U given previously, the Gauss-Seidel method is represented by
  x^{(k)} = (D - L)^{-1}Ux^{(k-1)} + (D - L)^{-1}b.
Letting T_g = (D - L)^{-1}U and c_g = (D - L)^{-1}b gives the Gauss-Seidel technique the form x^{(k)} = T_g x^{(k-1)} + c_g. For the lower-triangular matrix D - L to be nonsingular, it is necessary and sufficient that a_{ii} \ne 0 for each i = 1, 2, \dots, n.

1.2.2 Convergence

Convergent matrix: an n \times n matrix A is convergent if
  \lim_{k\to\infty} (A^k)_{ij} = 0, for each i, j = 1, 2, \dots, n.

Theorem: the following statements are equivalent:
- A is a convergent matrix;
- \lim_{n\to\infty} \|A^n\| = 0 for some natural norm;
- \lim_{n\to\infty} \|A^n\| = 0 for all natural norms;
- \rho(A) < 1;
- \lim_{n\to\infty} A^n x = 0 for every x.

Convergence results for general iteration methods: consider
  x^{(k)} = T x^{(k-1)} + c, for each k = 1, 2, \dots,
where x^{(0)} is arbitrary.
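Before studying convergence, the Gauss-Seidel update (1.2) can be sketched in code; it differs from Jacobi only in overwriting x in place so that components already computed in the current sweep are reused (the function name and the 4x4 example system are from the discussion above):

```python
import numpy as np

A = np.array([[10., -1.,  2.,  0.],
              [-1., 11., -1.,  3.],
              [ 2., -1., 10., -1.],
              [ 0.,  3., -1.,  8.]])
b = np.array([6., 25., -11., 15.])

def gauss_seidel(A, b, x0, tol=1e-3, max_iter=100):
    """Gauss-Seidel: x is updated in place, so x_1..x_{i-1} are already new."""
    n = len(b)
    x = x0.astype(float).copy()
    for k in range(1, max_iter + 1):
        x_old = x.copy()
        for i in range(n):
            s1 = A[i, :i] @ x[:i]        # new values x_j^(k), j < i
            s2 = A[i, i+1:] @ x[i+1:]    # old values x_j^(k-1), j > i
            x[i] = (b[i] - s1 - s2) / A[i, i]
        if np.max(np.abs(x - x_old)) / np.max(np.abs(x)) < tol:
            return x, k
    return x, max_iter

x, k = gauss_seidel(A, b, np.zeros(4))
print(k, x)   # roughly half the Jacobi iteration count for this system
```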
Lemma: if the spectral radius satisfies \rho(T) < 1, then (I - T)^{-1} exists, and
  (I - T)^{-1} = I + T + T^2 + \dots = \sum_{j=0}^{\infty} T^j.

Proof. If Tx = \lambda x, then (I - T)x = (1 - \lambda)x; since |\lambda| \le \rho(T) < 1, 1 is not an eigenvalue of T, so I - T is invertible. Let S_m = I + T + T^2 + \dots + T^m; then (I - T)S_m = I - T^{m+1}. Since \rho(T) < 1, T is convergent, so
  \lim_{m\to\infty} (I - T)S_m = \lim_{m\to\infty} (I - T^{m+1}) = I,
hence (I - T)^{-1} = \lim_{m\to\infty} S_m = \sum_{j=0}^{\infty} T^j.

Theorem: the iteration x^{(k)} = Tx^{(k-1)} + c, for k = 1, 2, \dots, converges to the unique solution of x = Tx + c for any x^{(0)} if and only if \rho(T) < 1.

Proof. Assume \rho(T) < 1. Then
  x^{(k)} = T^k x^{(0)} + (T^{k-1} + \dots + T + I)c,
so \lim_{k\to\infty} x^{(k)} = 0 + (I - T)^{-1}c. Thus x^{(k)} converges to x = (I - T)^{-1}c, which satisfies x = Tx + c.
To prove the converse, let x be the unique solution of x = Tx + c and let z be an arbitrary vector. Define x^{(0)} = x - z and, for k \ge 1, x^{(k)} = Tx^{(k-1)} + c. Then x^{(k)} converges to x, and
  x - x^{(k)} = T(x - x^{(k-1)}) = \dots = T^k(x - x^{(0)}) = T^k z,
so \lim_{k\to\infty} T^k z = \lim_{k\to\infty} (x - x^{(k)}) = 0. Since z \in R^n is arbitrary, by the theorem on convergent matrices T is convergent and \rho(T) < 1.

Corollary: if \|T\| < 1 for some natural matrix norm and c is a given vector, then the sequence \{x^{(k)}\} converges for any x^{(0)} to the x satisfying x = Tx + c, and the error bounds
  \|x - x^{(k)}\| \le \|T\|^k \|x^{(0)} - x\|,  \quad  \|x - x^{(k)}\| \le \frac{\|T\|^k}{1 - \|T\|} \|x^{(1)} - x^{(0)}\|
hold. (The proof is similar to that of the fixed-point theorem for a single nonlinear equation.)

Convergence of the Jacobi and Gauss-Seidel methods for special types of matrices. Recall the iteration matrices:
  Jacobi: T_j = D^{-1}(L + U);  Gauss-Seidel: T_g = (D - L)^{-1}U.
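The criterion \rho(T) < 1 can be checked numerically by forming the iteration matrices explicitly; a sketch for the 4x4 example system, using the A = D - L - U splitting convention of these notes:

```python
import numpy as np

A = np.array([[10., -1.,  2.,  0.],
              [-1., 11., -1.,  3.],
              [ 2., -1., 10., -1.],
              [ 0.,  3., -1.,  8.]])
b = np.array([6., 25., -11., 15.])

# Splitting A = D - L - U: D diagonal, -L strictly lower, -U strictly upper
D = np.diag(np.diag(A))
L = -np.tril(A, -1)
U = -np.triu(A, 1)

T_j = np.linalg.inv(D) @ (L + U)     # Jacobi iteration matrix
T_g = np.linalg.inv(D - L) @ U       # Gauss-Seidel iteration matrix

def spectral_radius(T):
    return np.max(np.abs(np.linalg.eigvals(T)))

rho_j = spectral_radius(T_j)
rho_g = spectral_radius(T_g)
print(rho_j, rho_g)   # both < 1 for this system, so both iterations converge

# Running the fixed-point iteration x = T_j x + c_j confirms convergence
c_j = np.linalg.inv(D) @ b
x = np.zeros(4)
for _ in range(50):
    x = T_j @ x + c_j
```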
Theorem 1. If A is strictly diagonally dominant, then for any x^{(0)} both the Jacobi and the Gauss-Seidel method converge to the unique solution of Ax = b.

Proof. Jacobi: T_j = D^{-1}(L + U), so the i-th row of T_j is
  (-a_{i1}/a_{ii}, \dots, -a_{i,i-1}/a_{ii}, 0, -a_{i,i+1}/a_{ii}, \dots, -a_{in}/a_{ii}),
and the sum of the absolute values of the i-th row equals
  \sum_{j\ne i} \frac{|a_{ij}|}{|a_{ii}|} < 1,
since A is strictly diagonally dominant. This implies \|T_j\|_\infty < 1 and hence \rho(T_j) < 1. Therefore Jacobi converges for any initial x^{(0)}.

Gauss-Seidel: T_g = (D - L)^{-1}U. Let x be an eigenvector corresponding to an eigenvalue \lambda of T_g, i.e. T_g x = \lambda x. Let x_i be the component of x of largest absolute value, and set \xi = x/x_i; then T_g \xi = \lambda \xi with \xi_i = 1 and |\xi_j| \le 1 for j \ne i. This leads to U\xi = \lambda(D - L)\xi, and the i-th component gives
  |\lambda| = \frac{|\sum_{j>i} a_{ij}\xi_j|}{|a_{ii} + \sum_{j<i} a_{ij}\xi_j|} \le \frac{\sum_{j>i} |a_{ij}||\xi_j|}{|a_{ii}| - \sum_{j<i} |a_{ij}||\xi_j|} \le \frac{\sum_{j>i} |a_{ij}|}{|a_{ii}| - \sum_{j<i} |a_{ij}|} < 1,
where the last inequality uses strict diagonal dominance. This holds for all eigenvalues of T_g, so \rho(T_g) < 1.

Is Gauss-Seidel better than the Jacobi method?

Theorem 2 (Stein-Rosenberg). If a_{ij} \le 0 for each i \ne j and a_{ii} > 0 for each i = 1, 2, \dots, n, then one and only one of the following statements holds:
- 0 < \rho(T_g) < \rho(T_j) < 1;
- 1 < \rho(T_j) < \rho(T_g);
- \rho(T_j) = \rho(T_g) = 0;
- \rho(T_j) = \rho(T_g) = 1.

Proof. See Young, D. M., Iterative Solution of Large Linear Systems.

Comments on this theorem:
- The first case, 0 < \rho(T_g) < \rho(T_j) < 1, says that when one method converges, then both converge, and the Gauss-Seidel method converges faster than the Jacobi method.
- The second case, 1 < \rho(T_j) < \rho(T_g), says that when one method diverges, then both diverge, and the divergence is more pronounced for the Gauss-Seidel method.

1.3 Relaxation methods

1.3.1 Successive over-relaxation (SOR) method

Let A = D - L - U. The iteration formula for Gauss-Seidel is
  x_i^{(k)} = \frac{1}{a_{ii}} [b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k-1)}],   (1.3)
and, considering the vector x^{(k)} as a whole, x^{(k)} = (D - L)^{-1}Ux^{(k-1)} + (D - L)^{-1}b, so T_g = (D - L)^{-1}U.

For the SOR method, at the k-th step the i-th component is computed in two stages: a Gauss-Seidel step, then a weighted average with the previous iterate,
  z_i^{(k)} = \frac{1}{a_{ii}} [b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{(k)} - \sum_{j=i+1}^{n} a_{ij} x_j^{(k-1)}],   (1.4)
  x_i^{(k)} = \omega z_i^{(k)} + (1 - \omega) x_i^{(k-1)}.
Or, in another form,
  a_{ii} x_i^{(k)} + \omega \sum_{j=1}^{i-1} a_{ij} x_j^{(k)} = (1 - \omega) a_{ii} x_i^{(k-1)} - \omega \sum_{j=i+1}^{n} a_{ij} x_j^{(k-1)} + \omega b_i.
In matrix form,
  x^{(k)} = (D - \omega L)^{-1}[(1 - \omega)D + \omega U]x^{(k-1)} + \omega(D - \omega L)^{-1}b.
The iteration matrix is T_\omega = (D - \omega L)^{-1}[(1 - \omega)D + \omega U] and c_\omega = \omega(D - \omega L)^{-1}b.

Example:
  4x_1 + 3x_2 = 24
  3x_1 + 4x_2 - x_3 = 30
  -x_2 + 4x_3 = -24
has the solution (3, 4, -5)^t. Compare SOR with \omega = 1.25, using x^{(0)} = (1, 1, 1)^t, against Gauss-Seidel. For the iterates to be accurate to 7 decimal places, the Gauss-Seidel method requires 34 iterations, as opposed to 14 iterations for SOR.

Remark: in SOR we apply the Gauss-Seidel formula for one component i, then relax that component. This is different from doing a full Gauss-Seidel sweep for all components i = 1, \dots, n and then relaxing all components at once, i.e.
  T_\omega \ne \omega T_g + (1 - \omega)I.

Choosing the optimal \omega: there is no complete answer for general linear systems, but the following results hold.

If a_{ii} \ne 0 for each i = 1, 2, \dots, n, then \rho(T_\omega) \ge |\omega - 1|. This implies that the SOR method can converge only if 0 < \omega < 2.

Proof. Let \lambda_1, \dots, \lambda_n be the eigenvalues of T_\omega, so \det(T_\omega) = \prod_i \lambda_i. On the other hand, since D - \omega L is lower triangular and (1 - \omega)D + \omega U is upper triangular,
  \det(T_\omega) = \det((D - \omega L)^{-1}) \det((1 - \omega)D + \omega U) = \frac{\det((1 - \omega)D)}{\det(D)} = (1 - \omega)^n.
Thus \rho(T_\omega) \ge |\det(T_\omega)|^{1/n} = |1 - \omega|, and \rho(T_\omega) < 1 requires 0 < \omega < 2.

If A is a positive definite matrix and 0 < \omega < 2, then the SOR method converges for any choice of initial approximate vector x^{(0)}.

Proof. See Ortega, J. M., Numerical Analysis: A Second Course, Academic Press, New York, 1972.

If A is positive definite and tridiagonal, then \rho(T_g) = [\rho(T_j)]^2 < 1, and the optimal \omega for SOR is
  \omega = \frac{2}{1 + \sqrt{1 - [\rho(T_j)]^2}};
with this choice of \omega, \rho(T_\omega) = \omega - 1.

Proof. See Ortega, J. M., Numerical Analysis: A Second Course, Academic Press, New York, 1972.

Example: A = \begin{pmatrix} 4 & 3 & 0 \\ 3 & 4 & -1 \\ 0 & -1 & 4 \end{pmatrix}. One can show that A is positive definite and tridiagonal, with [\rho(T_j)]^2 = 0.625. Thus the optimal \omega = 2/(1 + \sqrt{1 - 0.625}) \approx 1.24.
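The two-stage SOR update (1.4) can be sketched directly; note that \omega = 1 recovers Gauss-Seidel, which makes the comparison on the 3x3 example above easy (function and variable names are mine):

```python
import numpy as np

A = np.array([[ 4.,  3.,  0.],
              [ 3.,  4., -1.],
              [ 0., -1.,  4.]])
b = np.array([24., 30., -24.])

def sor(A, b, x0, omega, tol=1e-7, max_iter=200):
    """SOR: Gauss-Seidel value z for component i, then relax with weight omega."""
    n = len(b)
    x = x0.astype(float).copy()
    for k in range(1, max_iter + 1):
        x_old = x.copy()
        for i in range(n):
            z = (b[i] - A[i, :i] @ x[:i] - A[i, i+1:] @ x[i+1:]) / A[i, i]
            x[i] = omega * z + (1 - omega) * x_old[i]
        if np.max(np.abs(x - x_old)) < tol:
            return x, k
    return x, max_iter

x_gs, k_gs = sor(A, b, np.ones(3), omega=1.0)     # omega = 1 is Gauss-Seidel
x_sor, k_sor = sor(A, b, np.ones(3), omega=1.25)
print(k_gs, k_sor)   # SOR with omega = 1.25 needs far fewer iterations
```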
1.4 Error bounds

Let x be the unique solution of Ax = b, and let \tilde{x} be an estimate. When the residual b - A\tilde{x} is small, we would like \|x - \tilde{x}\| to be small as well. This is often the case, but certain systems, which occur frequently in practice, fail to have this property.

Check the example: A = \begin{pmatrix} 1 & 2 \\ 1.0001 & 2 \end{pmatrix}, b = \begin{pmatrix} 3 \\ 3.0001 \end{pmatrix} has the unique solution x = (1, 1)^t. The poor approximation \tilde{x} = (3, 0)^t has the residual vector
  r = b - A\tilde{x} = (0, -0.0002)^t,
so \|r\|_\infty = 0.0002, while \|x - \tilde{x}\|_\infty = 2!

Theorem: suppose that \tilde{x} is an approximation to the solution of Ax = b, A is a nonsingular matrix, and r = b - A\tilde{x}. Then, for any natural norm,
  \|x - \tilde{x}\| \le \|A^{-1}\| \|r\|,
and, if x \ne 0 and b \ne 0,
  \frac{\|x - \tilde{x}\|}{\|x\|} \le \|A\| \|A^{-1}\| \frac{\|r\|}{\|b\|}.

Definition: the condition number of the nonsingular matrix A relative to a norm \|\cdot\| is
  \kappa(A) = \|A\| \|A^{-1}\|,
so that
  \|x - \tilde{x}\| \le \kappa(A) \frac{\|r\|}{\|A\|}  and  \frac{\|x - \tilde{x}\|}{\|x\|} \le \kappa(A) \frac{\|r\|}{\|b\|}.
It is easy to see that \kappa(A) \ge 1. A matrix A is said to be well-conditioned if \kappa(A) is close to 1, and ill-conditioned when \kappa(A) is significantly greater than 1. If \|\cdot\| is the 2-norm, then \kappa(A) = \max_i \sigma_i / \min_i \sigma_i, where the \sigma_i are the singular values of A.

For A = \begin{pmatrix} 1 & 2 \\ 1.0001 & 2 \end{pmatrix} above, \|A\|_\infty = 3.0001, and
  A^{-1} = \begin{pmatrix} -10000 & 10000 \\ 5000.5 & -5000 \end{pmatrix},
so \|A^{-1}\|_\infty = 20000. Thus \kappa(A) = 60002 \gg 1! If the 2-norm is used, \kappa(A) \approx 50001 (use the Matlab command cond(A)).

1.5 The conjugate gradient method

The conjugate gradient method was originally proposed by Hestenes and Stiefel in 1952 as a direct method for solving an n \times n positive definite system. Its advantage over Gaussian elimination and the previously discussed iterative methods for positive definite systems is that it requires at most n steps (in exact arithmetic).

Recall the inner product: define the inner product of x, y \in R^n as \langle x, y \rangle = x^t y. The inner product satisfies the following properties:
1. \langle x, y \rangle = \langle y, x \rangle;
2. \langle \alpha x, y \rangle = \langle x, \alpha y \rangle = \alpha \langle x, y \rangle;
3. \langle x + z, y \rangle = \langle x, y \rangle + \langle z, y \rangle;
4. \langle x, x \rangle \ge 0;
5. \langle x, x \rangle = 0 if and only if x = 0.

When A is positive definite, \langle x, Ax \rangle > 0 unless x = 0. We say that two non-zero vectors x and y are conjugate with respect to A (or A-orthogonal) if \langle x, Ay \rangle = 0. It is easy to show that a set of pairwise conjugate vectors with respect to an SPD matrix A is linearly independent.

Conjugate direction idea: look for a set of conjugate directions v^{(k)} and represent x = \sum_k t_k v^{(k)}; then Ax = b gives
  \sum_k t_k Av^{(k)} = b, \quad t_k \langle v^{(k)}, Av^{(k)} \rangle = \langle v^{(k)}, b \rangle, \quad t_k = \frac{\langle v^{(k)}, b \rangle}{\langle v^{(k)}, Av^{(k)} \rangle}.

Theorem: let A be SPD. Then x^* is a solution to Ax = b if and only if x^* minimizes
  g(x) = \langle x, Ax \rangle - 2\langle x, b \rangle.

Theorem: let \{v^{(1)}, \dots, v^{(n)}\} be a set of non-zero conjugate directions with respect to an SPD matrix A, and let x^{(0)} be arbitrary. Define
  t_k = \frac{\langle v^{(k)}, b - Ax^{(k-1)} \rangle}{\langle v^{(k)}, Av^{(k)} \rangle}, \quad x^{(k)} = x^{(k-1)} + t_k v^{(k)},
for k = 1, \dots, n. Then, assuming exact arithmetic, Ax^{(n)} = b.

Theorem: the residual vectors r^{(k)}, where k = 1, 2, \dots, n, of a conjugate direction method satisfy
  \langle r^{(k)}, v^{(j)} \rangle = 0, for j = 1, 2, \dots, k.

Conjugate gradient method: let r^{(0)} = b - Ax^{(0)} and v^{(1)} = r^{(0)}, and generate the remaining directions within the iteration itself: for k =
1, 2, \dots, n, compute
  t_k = \frac{\langle r^{(k-1)}, r^{(k-1)} \rangle}{\langle v^{(k)}, Av^{(k)} \rangle},
  x^{(k)} = x^{(k-1)} + t_k v^{(k)},
  r^{(k)} = r^{(k-1)} - t_k Av^{(k)},
  s_k = \frac{\langle r^{(k)}, r^{(k)} \rangle}{\langle r^{(k-1)}, r^{(k-1)} \rangle},
  v^{(k+1)} = r^{(k)} + s_k v^{(k)}.

Preconditioned conjugate gradient (PCG) method: consider \tilde{A}\tilde{x} = \tilde{b}, where \tilde{A} = C^{-1}A(C^{-1})^t, \tilde{x} = C^t x, and \tilde{b} = C^{-1}b, with \tilde{A} better conditioned than A. The PCG method often achieves an acceptable solution in far fewer than n steps and is often used in the solution of large linear systems with sparse and positive definite matrices. The preconditioning matrix C is taken, for example, approximately equal to the factor L in the Cholesky factorization of A.
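The CG recurrences above translate into a short routine; a sketch (function name is mine), tested on the SPD system from the SOR section, whose exact solution is (3, 4, -5)^t:

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    """Conjugate gradient for SPD A, following the update formulas above."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.astype(float).copy()
    r = b - A @ x                 # r^(0)
    v = r.copy()                  # v^(1) = r^(0)
    max_iter = n if max_iter is None else max_iter
    for _ in range(max_iter):
        rr = r @ r
        if np.sqrt(rr) < tol:     # residual already negligible
            break
        Av = A @ v
        t = rr / (v @ Av)         # t_k = <r,r> / <v,Av>
        x = x + t * v
        r = r - t * Av
        s = (r @ r) / rr          # s_k = <r_new,r_new> / <r_old,r_old>
        v = r + s * v             # next conjugate direction
    return x

A = np.array([[ 4.,  3.,  0.],
              [ 3.,  4., -1.],
              [ 0., -1.,  4.]])
b = np.array([24., 30., -24.])
x = conjugate_gradient(A, b)
print(x)   # close to (3, 4, -5) within n = 3 steps, up to rounding
```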