COURSE 10

4.3. Iterative methods for solving linear systems

Because of round-off errors, direct methods become less efficient than iterative methods for large systems (more than 100 000 variables). An iterative scheme for linear systems consists of converting the system

    Ax = b    (1)

to the form

    x = b - Bx.

After an initial guess x^{(0)}, the sequence of approximations of the solution x^{(0)}, x^{(1)}, ..., x^{(k)}, ... is generated by computing

    x^{(k)} = b - Bx^{(k-1)},   for k = 1, 2, 3, ...
4.3.1. Jacobi iterative method

Consider the n x n linear system

    a_{11}x_1 + a_{12}x_2 + ... + a_{1n}x_n = b_1
    a_{21}x_1 + a_{22}x_2 + ... + a_{2n}x_n = b_2
    ...
    a_{n1}x_1 + a_{n2}x_2 + ... + a_{nn}x_n = b_n,

where we assume that the diagonal terms a_{11}, a_{22}, ..., a_{nn} are all nonzero. We begin our iterative scheme by solving each equation for one of the variables:

    x_1 = u_{12}x_2 + ... + u_{1n}x_n + c_1
    x_2 = u_{21}x_1 + u_{23}x_3 + ... + u_{2n}x_n + c_2
    ...
    x_n = u_{n1}x_1 + ... + u_{n,n-1}x_{n-1} + c_n,

where

    u_{ij} = -a_{ij}/a_{ii},   c_i = b_i/a_{ii},   i = 1, ..., n.
Let x^{(0)} = (x_1^{(0)}, x_2^{(0)}, ..., x_n^{(0)}) be an initial approximation of the solution. The (k+1)-th approximation is, for k = 0, 1, 2, ...:

    x_1^{(k+1)} = u_{12}x_2^{(k)} + ... + u_{1n}x_n^{(k)} + c_1
    x_2^{(k+1)} = u_{21}x_1^{(k)} + u_{23}x_3^{(k)} + ... + u_{2n}x_n^{(k)} + c_2
    ...
    x_n^{(k+1)} = u_{n1}x_1^{(k)} + ... + u_{n,n-1}x_{n-1}^{(k)} + c_n.

An algorithmic form:

    x_i^{(k)} = (b_i - Σ_{j=1, j≠i}^{n} a_{ij}x_j^{(k-1)}) / a_{ii},   i = 1, 2, ..., n, for k ≥ 1.

The iterative process is terminated when a convergence criterion is satisfied. Stopping criteria:

    ||x^{(k)} - x^{(k-1)}|| < ε   or   ||x^{(k)} - x^{(k-1)}|| / ||x^{(k)}|| < ε,

with ε > 0 a prescribed tolerance.
Example 1. Solve the following system using the Jacobi iterative method. Use ε = 10^{-3} and x^{(0)} = (0, 0, 0, 0) as the starting vector.

    7x_1 - 2x_2 +   x_3         = 17
     x_1 - 9x_2 +  3x_3 -   x_4 = 13
    2x_1        + 10x_3 +   x_4 = 15
     x_1 -  x_2 +   x_3 +  6x_4 = 10.

These equations can be rearranged to give

    x_1 = (17 + 2x_2 - x_3)/7
    x_2 = (-13 + x_1 + 3x_3 - x_4)/9
    x_3 = (15 - 2x_1 - x_4)/10
    x_4 = (10 - x_1 + x_2 - x_3)/6

and, for example,

    x_1^{(1)} = (17 + 2x_2^{(0)} - x_3^{(0)})/7
    x_2^{(1)} = (-13 + x_1^{(0)} + 3x_3^{(0)} - x_4^{(0)})/9
    x_3^{(1)} = (15 - 2x_1^{(0)} - x_4^{(0)})/10
    x_4^{(1)} = (10 - x_1^{(0)} + x_2^{(0)} - x_3^{(0)})/6.
Substitute x^{(0)} = (0, 0, 0, 0) into the right-hand side of each of these equations to get

    x_1^{(1)} = (17 + 0 - 0)/7 = 2.428 571 429
    x_2^{(1)} = (-13 + 0 + 3·0 - 0)/9 = -1.444 444 444
    x_3^{(1)} = (15 - 0 - 0)/10 = 1.5
    x_4^{(1)} = (10 - 0 + 0 - 0)/6 = 1.666 666 667

and so x^{(1)} = (2.428 571 429, -1.444 444 444, 1.5, 1.666 666 667). The Jacobi iterative process is

    x_1^{(k+1)} = (17 + 2x_2^{(k)} - x_3^{(k)})/7
    x_2^{(k+1)} = (-13 + x_1^{(k)} + 3x_3^{(k)} - x_4^{(k)})/9
    x_3^{(k+1)} = (15 - 2x_1^{(k)} - x_4^{(k)})/10
    x_4^{(k+1)} = (10 - x_1^{(k)} + x_2^{(k)} - x_3^{(k)})/6,   k ≥ 0.

We obtain a sequence that converges to the exact solution (2, -1, 1, 1); the stopping criterion is first satisfied at

    x^{(9)} ≈ (2.000127, -1.000100, 1.000118, 1.000162).
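The Jacobi scheme above is easy to turn into code. The following sketch (illustrative, not part of the notes) uses the matrix, right-hand side, zero starting vector, and tolerance ε = 10^{-3} of Example 1, with the absolute-difference stopping criterion:

```python
def jacobi(A, b, x0, eps=1e-3, max_iter=100):
    """Jacobi iteration: every component is updated from the previous iterate."""
    n = len(b)
    x_old = list(x0)
    for k in range(1, max_iter + 1):
        # x_i^(k) = (b_i - sum_{j != i} a_ij x_j^(k-1)) / a_ii
        x_new = [(b[i] - sum(A[i][j] * x_old[j] for j in range(n) if j != i)) / A[i][i]
                 for i in range(n)]
        # stop when ||x^(k) - x^(k-1)||_inf < eps
        if max(abs(x_new[i] - x_old[i]) for i in range(n)) < eps:
            return x_new, k
        x_old = x_new
    return x_old, max_iter

# The system of Example 1; its exact solution is (2, -1, 1, 1).
A = [[7, -2, 1, 0], [1, -9, 3, -1], [2, 0, 10, 1], [1, -1, 1, 6]]
b = [17, 13, 15, 10]
x, k = jacobi(A, b, [0.0] * 4)
print(k, [round(v, 4) for v in x])
```

With this tolerance the criterion is met at k = 9, in agreement with the example.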
4.3.2. Gauss-Seidel iterative method

Almost the same as the Jacobi method, except that each x-value is improved using the most recent approximations of the other variables. For an n x n system, the (k+1)-th approximation is

    x_1^{(k+1)} = u_{12}x_2^{(k)} + ... + u_{1n}x_n^{(k)} + c_1
    x_2^{(k+1)} = u_{21}x_1^{(k+1)} + u_{23}x_3^{(k)} + ... + u_{2n}x_n^{(k)} + c_2
    ...
    x_n^{(k+1)} = u_{n1}x_1^{(k+1)} + ... + u_{n,n-1}x_{n-1}^{(k+1)} + c_n,

with k = 0, 1, 2, ... and u_{ij} = -a_{ij}/a_{ii}, c_i = b_i/a_{ii}, i = 1, ..., n (as in the Jacobi method). Algorithmic form:

    x_i^{(k)} = (b_i - Σ_{j=1}^{i-1} a_{ij}x_j^{(k)} - Σ_{j=i+1}^{n} a_{ij}x_j^{(k-1)}) / a_{ii}
for each i =,,...n, and for k. Stopping criterions: prescribed tolerance, ε > 0. x (k) x (k ) x (k) x (k ) < ε, or x (k) < ε, with ε - a Remark Because the new values can be immediately stored in the location that held the old values, the storage requirements for x with the Gauss-Seidel method is half than that for Jacobi method and the rate of convergence is faster. Example 3 Solve the following system using the Gauss-Seidel iterative method. Use ε = 0 3 and x (0) = (0 0 0 0) as the starting vector. 7x x + x 3 = 7 x 9x + 3x 3 x 4 = 3 x + 0x 3 + x 4 = 5 x x + x 3 + 6x 4 = 0
We have

    x_1 = (17 + 2x_2 - x_3)/7
    x_2 = (-13 + x_1 + 3x_3 - x_4)/9
    x_3 = (15 - 2x_1 - x_4)/10
    x_4 = (10 - x_1 + x_2 - x_3)/6,

and, for example,

    x_1^{(1)} = (17 + 2x_2^{(0)} - x_3^{(0)})/7
    x_2^{(1)} = (-13 + x_1^{(1)} + 3x_3^{(0)} - x_4^{(0)})/9
    x_3^{(1)} = (15 - 2x_1^{(1)} - x_4^{(0)})/10
    x_4^{(1)} = (10 - x_1^{(1)} + x_2^{(1)} - x_3^{(1)})/6,
which provide the following Gauss-Seidel iterative process:

    x_1^{(k+1)} = (17 + 2x_2^{(k)} - x_3^{(k)})/7
    x_2^{(k+1)} = (-13 + x_1^{(k+1)} + 3x_3^{(k)} - x_4^{(k)})/9
    x_3^{(k+1)} = (15 - 2x_1^{(k+1)} - x_4^{(k)})/10
    x_4^{(k+1)} = (10 - x_1^{(k+1)} + x_2^{(k+1)} - x_3^{(k+1)})/6,

for k ≥ 0. Substitute x^{(0)} = (0, 0, 0, 0) into the right-hand side of each of these equations to get

    x_1^{(1)} = (17 + 0 - 0)/7 = 2.428 571 429
    x_2^{(1)} = (-13 + 2.428 571 429 + 3·0 - 0)/9 = -1.174 603 175
    x_3^{(1)} = (15 - 2·2.428 571 429 - 0)/10 = 1.014 285 714
    x_4^{(1)} = (10 - 2.428 571 429 - 1.174 603 175 - 1.014 285 714)/6 = 0.897 089 947
and so x^{(1)} = (2.428 571 429, -1.174 603 175, 1.014 285 714, 0.897 089 947). A similar procedure generates a sequence that converges to

    x^{(5)} ≈ (2.000025, -1.000130, 0.999960, 0.999981),

approaching the exact solution (2, -1, 1, 1).
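For comparison with the Jacobi sketch, Example 3 can be run with Gauss-Seidel updates. In this illustrative sketch (not from the notes), each component is overwritten as soon as it is computed, so only one vector is stored:

```python
def gauss_seidel(A, b, x0, eps=1e-3, max_iter=100):
    """Gauss-Seidel iteration: each component is overwritten immediately."""
    n = len(b)
    x = list(x0)
    for k in range(1, max_iter + 1):
        diff = 0.0
        for i in range(n):
            # uses the most recent values: x_j^(k) for j < i, x_j^(k-1) for j > i
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new = (b[i] - s) / A[i][i]
            diff = max(diff, abs(new - x[i]))
            x[i] = new
        if diff < eps:
            return x, k
    return x, max_iter

# Same system as in Example 3; exact solution (2, -1, 1, 1).
A = [[7, -2, 1, 0], [1, -9, 3, -1], [2, 0, 10, 1], [1, -1, 1, 6]]
b = [17, 13, 15, 10]
x, k = gauss_seidel(A, b, [0.0] * 4)
print(k, [round(v, 4) for v in x])
```

On this system Gauss-Seidel satisfies the same tolerance in fewer iterations than Jacobi.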
4.3.3. Relaxation method

In case of convergence, the Gauss-Seidel method is faster than the Jacobi method. The convergence can be further improved using the relaxation method (SOR method; SOR = Successive Over-Relaxation). Algorithmic form of the method:

    x_i^{(k)} = (ω/a_{ii}) (b_i - Σ_{j=1}^{i-1} a_{ij}x_j^{(k)} - Σ_{j=i+1}^{n} a_{ij}x_j^{(k-1)}) + (1 - ω)x_i^{(k-1)},

for each i = 1, 2, ..., n, and for k ≥ 1.

For 0 < ω < 1 the procedure is called the under-relaxation method; it can be used to obtain convergence for some systems that are not convergent by the Gauss-Seidel method. For ω > 1 the procedure is called the over-relaxation method; it can be used to accelerate the convergence for systems that are convergent by the Gauss-Seidel method.
By Kahan's theorem, the relaxation method can converge only if 0 < ω < 2.

Remark 4. For ω = 1, the relaxation method is the Gauss-Seidel method.

Example 5. Solve the following system using the relaxation iterative method. Use ε = 10^{-3}, x^{(0)} = (1, 1, 1) and ω = 1.25.

    4x_1 + 3x_2        = 24
    3x_1 + 4x_2 -  x_3 = 30
         -  x_2 + 4x_3 = -24

We have

    x_1^{(k)} = -0.25x_1^{(k-1)} - 0.9375x_2^{(k-1)} + 7.5
    x_2^{(k)} = -0.9375x_1^{(k)} - 0.25x_2^{(k-1)} + 0.3125x_3^{(k-1)} + 9.375
    x_3^{(k)} = 0.3125x_2^{(k)} - 0.25x_3^{(k-1)} - 7.5,

for k ≥ 1. The solution is (3, 4, -5).
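The SOR sweep of Example 5 can be sketched directly from the algorithmic form. This illustrative code (not from the notes) uses the same ω = 1.25, tolerance 10^{-3} and starting vector (1, 1, 1):

```python
def sor(A, b, x0, omega, eps=1e-3, max_iter=100):
    """SOR: a Gauss-Seidel update blended with the previous value via omega."""
    n = len(b)
    x = list(x0)
    for k in range(1, max_iter + 1):
        diff = 0.0
        for i in range(n):
            # x_i^(k) = (omega/a_ii)(b_i - sum_{j<i} a_ij x_j^(k)
            #           - sum_{j>i} a_ij x_j^(k-1)) + (1-omega) x_i^(k-1)
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new = (omega / A[i][i]) * (b[i] - s) + (1 - omega) * x[i]
            diff = max(diff, abs(new - x[i]))
            x[i] = new
        if diff < eps:
            return x, k
    return x, max_iter

# Example 5: exact solution (3, 4, -5).
A = [[4, 3, 0], [3, 4, -1], [0, -1, 4]]
b = [24, 30, -24]
x, k = sor(A, b, [1.0, 1.0, 1.0], omega=1.25)
print(k, [round(v, 4) for v in x])
```

Setting omega=1.0 in the call reproduces plain Gauss-Seidel, as stated in Remark 4.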
4.3.4. The matrix formulations of the iterative methods

Split the matrix A into the sum A = D + L + U, where D is the diagonal of A, L the strictly lower triangular part of A, and U the strictly upper triangular part of A. That is,

    D = [ a_{11}   0    ...   0
            0    a_{22} ...   0
           ...
            0      0    ... a_{nn} ],

    L = [   0      0    ...   0
          a_{21}   0    ...   0
           ...
          a_{n1}  ... a_{n,n-1} 0 ],

    U = [   0    a_{12} ... a_{1n}
            0      0    ... a_{2n}
           ...
            0      0    ...   0 ].

The system Ax = b can be written as (D + L + U)x = b.
The Jacobi method in matrix form is given by

    Dx^{(k)} = -(L + U)x^{(k-1)} + b,

the Gauss-Seidel method in matrix form is given by

    (D + L)x^{(k)} = -Ux^{(k-1)} + b,

and the relaxation method in matrix form is given by

    (D + ωL)x^{(k)} = ((1 - ω)D - ωU)x^{(k-1)} + ωb.

Convergence of the iterative methods

Remark 6. The convergence (or divergence) of the iterative process in the Jacobi and Gauss-Seidel methods does not depend on the initial guess, but depends only on the character of the matrices themselves. However, in case of convergence, a good first guess will make for a relatively small number of iterations.
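To connect the matrix form with the componentwise form, a short check can be made (illustrative, not from the notes): build D, L, U from A, then verify that one matrix-form Jacobi step Dx^{(k)} = -(L + U)x^{(k-1)} + b reproduces the componentwise update.

```python
def split(A):
    """Return the splitting A = D + L + U (diagonal, strictly lower, strictly upper)."""
    n = len(A)
    D = [[A[i][j] if i == j else 0 for j in range(n)] for i in range(n)]
    L = [[A[i][j] if i > j else 0 for j in range(n)] for i in range(n)]
    U = [[A[i][j] if i < j else 0 for j in range(n)] for i in range(n)]
    return D, L, U

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

A = [[7, -2, 1, 0], [1, -9, 3, -1], [2, 0, 10, 1], [1, -1, 1, 6]]
b = [17, 13, 15, 10]
D, L, U = split(A)
x_prev = [1.0, 1.0, 1.0, 1.0]

# Matrix form: D x_new = -(L+U) x_prev + b  (D is diagonal, so division suffices).
LU = matvec([[L[i][j] + U[i][j] for j in range(4)] for i in range(4)], x_prev)
x_mat = [(b[i] - LU[i]) / D[i][i] for i in range(4)]

# Componentwise Jacobi step for comparison.
x_cmp = [(b[i] - sum(A[i][j] * x_prev[j] for j in range(4) if j != i)) / A[i][i]
         for i in range(4)]
print(x_mat == x_cmp)
```

The same exercise can be repeated for the Gauss-Seidel and relaxation splittings.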
A sufficient condition for convergence:

Theorem 7 (Convergence Theorem). If A is strictly diagonally dominant, then the Jacobi, Gauss-Seidel and relaxation methods converge for any choice of the starting vector x^{(0)}.

Example 8. Consider a system of equations with coefficient matrix

    A = [ 3 -1  1
          2 -4  0
          1  2  6 ].

The coefficient matrix of the system is strictly diagonally dominant since

    |a_{11}| = |3| = 3 > |-1| + |1| = 2,
    |a_{22}| = |-4| = 4 > |2| + |0| = 2,
    |a_{33}| = |6| = 6 > |1| + |2| = 3.

Hence, if the Jacobi or Gauss-Seidel method is used to solve the system of equations, it will converge for any choice of the starting vector x^{(0)}.
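The row-by-row check of Example 8 is mechanical and easy to automate. A small helper (illustrative, not from the notes):

```python
def strictly_diagonally_dominant(A):
    """True if |a_ii| > sum of |a_ij|, j != i, for every row i."""
    n = len(A)
    return all(abs(A[i][i]) > sum(abs(A[i][j]) for j in range(n) if j != i)
               for i in range(n))

A = [[3, -1, 1], [2, -4, 0], [1, 2, 6]]   # the matrix of Example 8
print(strictly_diagonally_dominant(A))     # each diagonal entry dominates its row
```

Note that this is only a sufficient condition: the test returning False does not by itself imply divergence of the iterations.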
Example 9. Consider the linear system

    4x_1 +  x_2 = 3
    2x_1 + 5x_2 = 1.

a) Perform two iterations of the Jacobi method on this system, beginning with the vector x^{(0)} = [3, 1].

b) Perform two iterations of the Gauss-Seidel method on this system, beginning with the vector x^{(0)} = [3, 1].

(The solutions of the system are 7/9 and -1/9.)
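The two hand computations requested in Example 9 can be checked with a few lines of code. This is an illustrative sketch; the starting vector (3, 1) is the one read from the statement above.

```python
# Two iterations of Jacobi and of Gauss-Seidel for Example 9.
A = [[4, 1], [2, 5]]
b = [3, 1]

# Jacobi: both components are updated from the previous iterate.
x = [3.0, 1.0]
for _ in range(2):
    x = [(b[0] - A[0][1] * x[1]) / A[0][0],
         (b[1] - A[1][0] * x[0]) / A[1][1]]

# Gauss-Seidel: the new x_1 is used at once when updating x_2.
y = [3.0, 1.0]
for _ in range(2):
    y[0] = (b[0] - A[0][1] * y[1]) / A[0][0]
    y[1] = (b[1] - A[1][0] * y[0]) / A[1][1]

print(x, y)   # both sequences head toward the exact solution (7/9, -1/9)
```

After two sweeps the Gauss-Seidel iterate (0.75, -0.1) is already noticeably closer to (7/9, -1/9) ≈ (0.7778, -0.1111) than the Jacobi iterate.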
5. Numerical methods for solving nonlinear equations in R

Let f : Ω → R, Ω ⊆ R. Consider the equation

    f(x) = 0,  x ∈ Ω.  (2)

We attach to this equation a mapping F : D^n → D, D ⊆ Ω. Let (x_0, ..., x_{n-1}) ∈ D^n. Using F and the numbers x_0, x_1, ..., x_{n-1} we construct iteratively the sequence

    x_0, x_1, ..., x_{n-1}, x_n, ...  (3)

with

    x_i = F(x_{i-n}, ..., x_{i-1}),  i = n, n+1, ....  (4)

The problem consists in choosing F and x_0, ..., x_{n-1} ∈ D such that the sequence (3) converges to the solution of the equation (2).
Definition 10. The procedure of approximating the solution of equation (2) by the elements of the sequence (3), computed as in (4), is called an F-method. The numbers x_0, x_1, ..., x_{n-1} are called the starting points, and the k-th element of the sequence (3) is called the approximation of k-th order of the solution. If the set of starting points has only one element, then the F-method is a one-step method; if it has more than one element, then the F-method is a multistep method.

Definition 11. If the sequence (3) converges to the solution of the equation (2), then the F-method is convergent; otherwise it is divergent.
Definition 12. Let α ∈ Ω be a solution of the equation (2) and let x_0, x_1, ..., x_{n-1}, x_n, ... be the sequence generated by a given F-method. The number p having the property

    lim_{x_i → α} (α - F(x_{i-n+1}, ..., x_i)) / (α - x_i)^p = C ≠ 0,   C = constant,

is called the order of the F-method.

We construct some classes of F-methods based on interpolation procedures. Let α ∈ Ω be a solution of the equation (2) and V(α) a neighborhood of α. Assume that f has an inverse on V(α) and denote g := f^{-1}. Since

    f(α) = 0,

it follows that

    α = g(0).

This way, the approximation of the solution α is reduced to the approximation of g(0).
Definition 13. The approximation of g by means of an interpolating method, and of α by the value of g at the point zero, is called the inverse interpolation procedure.

5.1. One-step methods

Let F be a one-step method, i.e., for a given x_i we have x_{i+1} = F(x_i).

Remark 14. If p = 1, the convergence condition is |F'(x)| < 1. If p > 1, there always exists a neighborhood of α where the F-method converges.

All the information on f is given at a single point, the starting value, so we are led to Taylor interpolation.

Theorem 15. Let α be a solution of equation (2), V(α) a neighborhood of α, x, x_i ∈ V(α), and let f fulfill the necessary continuity conditions.
Then we have the following method, denoted by F_m^T, for approximating α:

    F_m^T(x_i) = x_i + Σ_{k=1}^{m-1} ((-1)^k / k!) [f(x_i)]^k g^{(k)}(f(x_i)),  (5)

where g = f^{-1}.

Proof. There exists g = f^{-1} ∈ C^m[V(0)]. Let y_i = f(x_i) and consider the Taylor interpolation formula

    g(y) = (T_{m-1}g)(y) + (R_{m-1}g)(y),

with

    (T_{m-1}g)(y) = Σ_{k=0}^{m-1} (1/k!) (y - y_i)^k g^{(k)}(y_i),

and R_{m-1}g the corresponding remainder. Since α = g(0) and g ≈ T_{m-1}g, it follows that

    α ≈ (T_{m-1}g)(0) = x_i + Σ_{k=1}^{m-1} ((-1)^k / k!) y_i^k g^{(k)}(y_i).
Hence,

    x_{i+1} := F_m^T(x_i) = x_i + Σ_{k=1}^{m-1} ((-1)^k / k!) [f(x_i)]^k g^{(k)}(f(x_i))

is an approximation of α, and F_m^T is an approximation method for the solution α.

Concerning the order of the method F_m^T we state:

Theorem 16. If g = f^{-1} satisfies the condition g^{(m)}(0) ≠ 0, then ord(F_m^T) = m.

Remark 17. We have an upper bound for the absolute error in approximating α by x_{i+1}:

    |α - F_m^T(x_i)| ≤ (1/m!) |f(x_i)|^m M_m g,   with M_m g = sup_{y ∈ V(0)} |g^{(m)}(y)|.
Particular cases.

1) Case m = 2.

    F_2^T(x_i) = x_i - f(x_i)/f'(x_i).

This method is called Newton's method (the tangent method). Its order is 2.

2) Case m = 3.

    F_3^T(x_i) = x_i - (f(x_i)/f'(x_i)) [1 + (f(x_i) f''(x_i)) / (2[f'(x_i)]^2)],

with ord(F_3^T) = 3. So, this method converges faster than F_2^T.

3) Case m = 4.

    F_4^T(x_i) = x_i - f(x_i)/f'(x_i) - (f^2(x_i) f''(x_i)) / (2[f'(x_i)]^3)
                 + (f'(x_i)f'''(x_i) - 3[f''(x_i)]^2) f^3(x_i) / (3! [f'(x_i)]^5).
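The gain in order can be observed numerically. The following sketch (illustrative, not from the notes) compares one step of F_2^T (Newton) with one step of F_3^T on the test function f(x) = x^2 - 2, whose positive root is sqrt(2):

```python
import math

def f(x):   return x * x - 2
def df(x):  return 2 * x
def d2f(x): return 2.0

def F2(x):
    """One Newton step (case m = 2)."""
    return x - f(x) / df(x)

def F3(x):
    """One step of the third-order method (case m = 3)."""
    return x - (f(x) / df(x)) * (1 + f(x) * d2f(x) / (2 * df(x) ** 2))

x0 = 1.5
root = math.sqrt(2)
print(abs(F2(x0) - root), abs(F3(x0) - root))  # the F3 step lands much closer
```

From x_0 = 1.5 the Newton step is off by about 2.5e-3 while the F_3^T step is off by about 1.4e-4, consistent with orders 2 and 3.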
Remark 18. The higher the order of a method, the faster it converges. Still, this does not mean that a higher order method is more efficient (computation requirements). On the contrary, the most efficient are the methods of relatively low order, due to their low complexity (methods F_2^T and F_3^T).

Newton's method

According to Remark 14, there always exists a neighborhood of α where the F_2^T method is convergent. Choosing x_0 in such a neighborhood allows approximating α, with a prescribed error ε, by terms of the sequence

    x_{i+1} = F_2^T(x_i) = x_i - f(x_i)/f'(x_i),   i = 0, 1, ....

If α is a solution of equation (2) and x_{n+1} = F_2^T(x_n), then for the approximation error Remark 17 gives

    |α - x_{n+1}| ≤ (1/2) [f(x_n)]^2 M_2 g.
Lemma 19. Let α ∈ (a, b) be a solution of equation (2) and let x_n = F_2^T(x_{n-1}). Then

    |α - x_n| ≤ (1/m_1) |f(x_n)|,

with m_1 = min_{a ≤ x ≤ b} |f'(x)|.

Proof. We use the mean value formula

    f(α) - f(x_n) = f'(ξ)(α - x_n),

with ξ belonging to the interval determined by α and x_n. From f(α) = 0 and |f'(x)| ≥ m_1 for x ∈ (a, b), it follows that

    |f(x_n)| ≥ m_1 |α - x_n|,

that is, |α - x_n| ≤ (1/m_1)|f(x_n)|.

In practical applications the following evaluation is more useful:
Lemma 20. If f ∈ C^1[a, b] and F_2^T is convergent, then there exists n_0 ∈ N such that

    |x_n - α| ≤ |x_n - x_{n-1}|,   n > n_0.

Remark 21. The starting value is chosen randomly. If, after a fixed number of iterations, the required precision is not achieved, i.e., the condition

    |x_n - x_{n-1}| ≤ ε

does not hold for a prescribed positive ε, the computation has to be started over with a new starting value.

A modified form of Newton's method:

    x_{k+1} = x_k - f(x_k)/f'(x_0),   k = 0, 1, ....

Here f'(x_0) keeps the same value during the whole computation. The method is very useful because it does not require the computation of f' at x_j, j = 1, 2, ..., but the order is no longer equal to 2.
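The frozen-derivative variant trades order for cheaper steps. A sketch (illustrative, not from the notes) on the test function f(x) = x^2 - 2:

```python
def modified_newton(f, dfx0, x0, eps=1e-10, max_iter=100):
    """Newton's method with the derivative frozen at x0:
    x_{k+1} = x_k - f(x_k)/f'(x_0)."""
    x = x0
    for k in range(1, max_iter + 1):
        x_new = x - f(x) / dfx0          # f'(x_0) is reused at every step
        if abs(x_new - x) < eps:
            return x_new, k
        x = x_new
    return x, max_iter

f = lambda x: x * x - 2
x, k = modified_newton(f, dfx0=3.0, x0=1.5)   # f'(1.5) = 3
print(x, k)   # converges to sqrt(2), but only linearly
```

Each step costs one function evaluation and no derivative evaluation, which is attractive when f' is expensive; the price is linear instead of quadratic convergence.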
Another way of obtaining Newton's method. We start with x_0 as an initial guess, sufficiently close to the root α. The next approximation x_1 is the point at which the tangent line to f at (x_0, f(x_0)) crosses the Ox-axis. The value x_1 is much closer to the root α than x_0. We write the equation of the tangent line at (x_0, f(x_0)):

    y - f(x_0) = f'(x_0)(x - x_0).

If x = x_1 is the point where this line intersects the Ox-axis, then y = 0, and solving for x gives

    -f(x_0) = f'(x_0)(x_1 - x_0),   x_1 = x_0 - f(x_0)/f'(x_0).

By repeating the process using the tangent line at (x_1, f(x_1)), we obtain for x_2:

    x_2 = x_1 - f(x_1)/f'(x_1).
For the general case we have

    x_{n+1} = x_n - f(x_n)/f'(x_n),   n ≥ 0.  (6)

[Figure: successive tangent lines to f, with the iterates x_0, x_1, x_2 approaching the root α.]

The algorithm: let x_0 be the initial approximation.

    for n = 0, 1, ..., ITMAX
        x_{n+1} ← x_n - f(x_n)/f'(x_n).
A stopping criterion is

    |f(x_n)| ≤ ε   or   |x_{n+1} - x_n| ≤ ε   or   |x_{n+1} - x_n| / |x_{n+1}| ≤ ε,

where ε is a specified tolerance value.

Example 22. Use Newton's method to compute a root of

    x^3 - x^2 - 1 = 0,

to an accuracy of 10^{-4}. Use x_0 = 1.
Sol. The derivative of f(x) = x^3 - x^2 - 1 is f'(x) = 3x^2 - 2x. Using x_0 = 1 gives f(1) = -1 and f'(1) = 1, and so the first Newton iterate is

    x_1 = 1 - (-1)/1 = 2,

with f(2) = 3, f'(2) = 8. The next iterate is

    x_2 = 2 - 3/8 = 1.625.

Continuing in this manner we obtain a sequence of approximations which converges to 1.465571.
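The hand computation of Example 22 can be checked directly. An illustrative sketch (not from the notes) using the |x_{n+1} - x_n| ≤ ε stopping test:

```python
def newton(f, df, x0, eps=1e-4, max_iter=50):
    """Plain Newton iteration with the |x_{n+1} - x_n| <= eps stopping test."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / df(x)
        if abs(x_new - x) <= eps:
            return x_new
        x = x_new
    return x

f  = lambda x: x ** 3 - x ** 2 - 1
df = lambda x: 3 * x ** 2 - 2 * x

x1 = 1 - f(1) / df(1)       # first iterate from x0 = 1
x2 = x1 - f(x1) / df(x1)    # second iterate
root = newton(f, df, 1.0)
print(x1, x2, root)         # 2.0, 1.625, and the root near 1.465571
```

The first two iterates reproduce the hand computation, and the sequence settles at the root of x^3 - x^2 - 1 to the requested accuracy.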