MATH 5330: Computational Methods of Linear Algebra

Lecture Note 3: Stationary Iterative Methods

Xianyi Zeng

Department of Mathematical Sciences, UTEP

1 Stationary Iterative Methods

Gaussian elimination (or in general most direct methods) requires $O(n^3)$ computational cost, which is not acceptable when $n$ is large. For example, let us consider the direct numerical simulation of the Navier-Stokes equations on a unit cube in 3D; we discretize the domain with $N^3$ cubes and compute the solutions up to $t = T$. The Navier-Stokes equations have five variables at each node, hence the solution vector is $u \in \mathbb{R}^{5N^3}$. When an implicit/explicit (IMEX) method is used for the time integration, the time step size scales with $1/N$ and the total number of time steps is $O(N)$. For each time step, there is a nonlinear equation to solve:
$$f(u^{n+1}) = g(u^n),$$
to update the solution from one time step ($u^n$) to the next ($u^{n+1}$). Let $K$ be the average number of iterations of the Newton method to solve this nonlinear system; in total we need to solve a linear system $O(KN)$ times, and each such linear system has size $5N^3 \times 5N^3$. If a direct method is used for the linear solves, the total computational cost is:
$$O\big((5N^3)^3\big) \cdot O(KN) = O(KN^{10}),$$
which is prohibitive even for moderate values of $N$.

The target of stationary iterative methods is to reduce the computational cost of linear solves to magnitudes smaller than $O(n^3)$. Particularly, in solving $Ax = b$, let us write $A = M + (A - M)$ for some matrix $M$ and rewrite the linear system as:
$$Mx = -(A - M)x + b. \qquad (1.1)$$
In an iterative method, we start with an initial guess $x_0$ and try to improve the result by solving for $x_{k+1}$, $k = 0, 1, \ldots$:
$$Mx_{k+1} = -(A - M)x_k + b, \quad\text{or equivalently}\quad x_{k+1} = -M^{-1}(A - M)x_k + M^{-1}b. \qquad (1.2)$$
Let us look at the last equation: clearly a requirement for the iterative method to make sense is that the linear system associated with $M$ should be easy to solve, in the sense that the cost is no more than $O(n^2)$. Two such choices are diagonal matrices ($O(n)$) and triangular matrices ($O(n^2)$). Next, we also want to make sure that if there exists a solution $x$, then $x_k \to x$ as $k \to +\infty$. Finally, provided that $x_k \to x$, we hope $\|x_k - x\|$ becomes reasonably small within only a few iterations. These are the questions we'd like to answer for any stationary iterative method in this lecture, or simply iterative methods in this section.
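As a concrete illustration of (1.2), here is a minimal sketch in Python/NumPy of the generic splitting iteration. The function name `stationary_solve`, the dense call to `np.linalg.solve` for the $M$-system, and the stopping test on the step size are illustrative assumptions rather than part of the lecture; in practice one would exploit the diagonal or triangular structure of $M$.

```python
import numpy as np

def stationary_solve(A, b, M, x0, tol=1e-10, max_iter=1000):
    """Generic stationary iteration: solve M x_{k+1} = (M - A) x_k + b.

    A dense solve is used for clarity; a practical code would exploit
    the structure of M (diagonal or triangular).
    """
    x = x0.copy()
    for k in range(max_iter):
        x_new = np.linalg.solve(M, (M - A) @ x + b)
        if np.linalg.norm(x_new - x) < tol:  # stop when the update stalls
            return x_new, k + 1
        x = x_new
    return x, max_iter
```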
Let $A$ be non-singular and let $x$ solve $Ax = b$; we first look at convergence. Define $\varepsilon_k = x_k - x$ as the error vector in the $k$-th iteration; then:
$$M\varepsilon_{k+1} = M(x_{k+1} - x) = [-(A - M)x_k + b] - [-(A - M)x + b] = -(A - M)\varepsilon_k,$$
or equivalently:
$$\varepsilon_{k+1} = G\varepsilon_k, \quad G = -M^{-1}(A - M). \qquad (1.3)$$
The growth matrix $G$ remains the same for all iterations (hence the name "stationary iterative methods"), thus we obtain an estimate on the error $\varepsilon_k$:
$$\varepsilon_k = G^k\varepsilon_0 \;\implies\; \|\varepsilon_k\| \le \|G^k\|\,\|\varepsilon_0\|. \qquad (1.4)$$
Thus we have $\|\varepsilon_k\| \to 0$ for any $\varepsilon_0$ if $G^k \to 0$ as $k \to \infty$, for which a sufficient and necessary condition is given by Theorem 1.1.

Remark 1.1. Strictly speaking, we do not need $G^k \to 0$ to deduce $\varepsilon_k \to 0$ if we can choose $\varepsilon_0$ carefully. In an extreme case, if $\varepsilon_0$ is in the null space of some $G^{k_0}$ (which happens with $k_0 = 1$ in the foolish case when $M = A$), the error becomes zero after $k_0$ iterations.

Theorem 1.1. $G^k \to 0$ as $k \to \infty$ if and only if $\rho(G) < 1$, where the spectral radius $\rho(G)$ is the maximum absolute value of all the eigenvalues of $G$.

Proof. We consider the Jordan canonical form $G = QJQ^{-1}$, where $Q$ is invertible and $J$ is given by:
$$J = \begin{pmatrix} J_1 & & \\ & \ddots & \\ & & J_m \end{pmatrix}, \quad J_l = \begin{pmatrix} \lambda_l & 1 & & \\ & \lambda_l & \ddots & \\ & & \ddots & 1 \\ & & & \lambda_l \end{pmatrix} \in \mathbb{R}^{n_l \times n_l}, \quad l = 1, \ldots, m. \qquad (1.5)$$
Here $\lambda_1, \ldots, \lambda_m$ are the eigenvalues of $G$, which may not be different from each other, and $n_1 + \cdots + n_m = n$. To this end $G^k = QJ^kQ^{-1}$ and we just need to show $J^k \to 0$ if and only if $\rho(G) < 1$, or equivalently $|\lambda_l| < 1$ for all $l$. Because $J^k = \mathrm{diag}(J_1^k, J_2^k, \ldots, J_m^k)$, we only need to show $J_l^k \to 0$ if and only if $|\lambda_l| < 1$. The last point is straightforward as:
$$J_l^k = \begin{pmatrix} \lambda_l^k & \binom{k}{1}\lambda_l^{k-1} & \cdots & \binom{k}{n_l-1}\lambda_l^{k-n_l+1} \\ & \lambda_l^k & \ddots & \vdots \\ & & \ddots & \binom{k}{1}\lambda_l^{k-1} \\ & & & \lambda_l^k \end{pmatrix}, \qquad (1.6)$$
and the fact that each entry converges to zero as $k \to \infty$ if and only if $|\lambda_l| < 1$. $\square$
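Theorem 1.1 can be checked numerically on small test problems: form $G = -M^{-1}(A - M)$ and compare $\rho(G)$ with $1$. The helper below is a sketch under the assumption that a dense eigenvalue computation is affordable; it costs $O(n^3)$ itself, so it is a diagnostic for study purposes, not a production tool.

```python
import numpy as np

def growth_matrix(A, M):
    # G = -M^{-1}(A - M), the iteration matrix of the splitting A = M + (A - M)
    return -np.linalg.solve(M, A - M)

def spectral_radius(G):
    # rho(G): maximum modulus among the eigenvalues of G
    return np.max(np.abs(np.linalg.eigvals(G)))

# Example: the iteration converges for every initial guess iff rho(G) < 1.
A = np.array([[4.0, 1.0], [2.0, 5.0]])
M = np.diag(np.diag(A))                      # Jacobi splitting as a test case
print(spectral_radius(growth_matrix(A, M)))  # about 0.32 < 1 here
```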
In fact, if $G = -M^{-1}(A - M)$ has spectral radius smaller than one, we can drop the assumption that $A$ is non-singular (it remains true, though), as stated by the next theorem.

Theorem 1.2. If $\rho(G) < 1$, then $A$ is invertible.

Proof. Because $A = M(I - G)$, we just need to show that $I - G$ is invertible. Indeed, if $(I - G)x = 0$ for some vector $x \in \mathbb{R}^n$, then $x = Ix = Gx = G^2x = \cdots = G^kx \to 0$ by Theorem 1.1; hence the null space of $I - G$ contains only the zero vector and $I - G$ is invertible. $\square$

The condition of the preceding theorem is in general very difficult to check; a more convenient one is based on the inequality $\rho(G) \le \|G\|$. Hence a sufficient condition for convergence is given by $\|G\| < 1$ for some induced matrix norm.

2 Jacobi Method

The simplest iterative method is given by choosing $M$ as the diagonal part of $A$; this is called the Jacobi method. Let $A = L + D + U$, where $L$, $D$, and $U$ are the lower-triangular part, the diagonal part, and the upper-triangular part of $A$, respectively (not to be confused with the LU or LDU decomposition!). Then in the Jacobi method, $M = D$ and (1.2) reduces to:
$$x_{k+1} = -D^{-1}(L + U)x_k + D^{-1}b. \qquad (2.1)$$
The growth matrix $G = -D^{-1}(L + U)$ is:
$$G = -\begin{pmatrix} a_{11} & & \\ & \ddots & \\ & & a_{nn} \end{pmatrix}^{-1}\begin{pmatrix} 0 & a_{12} & \cdots & a_{1n} \\ a_{21} & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{n-1,n} \\ a_{n1} & \cdots & a_{n,n-1} & 0 \end{pmatrix} = -\begin{pmatrix} 0 & a_{12}/a_{11} & \cdots & a_{1n}/a_{11} \\ a_{21}/a_{22} & 0 & \ddots & \vdots \\ \vdots & \ddots & \ddots & a_{n-1,n}/a_{n-1,n-1} \\ a_{n1}/a_{nn} & \cdots & a_{n,n-1}/a_{nn} & 0 \end{pmatrix}. \qquad (2.2)$$
If the diagonal elements of $A$ are sufficiently large, say:
$$|a_{ii}| > \sum_{j \ne i} |a_{ij}|, \quad i = 1, \ldots, n, \qquad (2.3)$$
then we have $\|G\|_\infty < 1$. By the argument at the end of Section 1, we see that the Jacobi method converges if $A$ is diagonally dominant, i.e., if $A$ satisfies (2.3).

(2.3) actually provides a way to estimate the number of iterations needed in order to achieve a certain accuracy. Let:
$$\rho_{\mathrm{jac}} = \max_i \sum_{j \ne i} \frac{|a_{ij}|}{|a_{ii}|} < 1, \qquad (2.4)$$
then by (1.4), $\|\varepsilon_k\|_\infty \le \rho_{\mathrm{jac}}^k\,\|\varepsilon_0\|_\infty$.
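Below is a minimal NumPy sketch of the Jacobi iteration (2.1). Extracting the diagonal once and applying the off-diagonal remainder $L + U$ as a single matrix-vector product is an implementation choice of this sketch, not something prescribed by the lecture.

```python
import numpy as np

def jacobi(A, b, x0, num_iters=100):
    """Jacobi iteration (2.1): x_{k+1} = -D^{-1}(L+U) x_k + D^{-1} b."""
    d = np.diag(A)           # diagonal part D, stored as a vector
    R = A - np.diag(d)       # off-diagonal part L + U
    x = x0.astype(float)
    for _ in range(num_iters):
        x = (b - R @ x) / d  # componentwise division by a_ii
    return x
```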
To see an example of diagonally dominant linear systems, we consider an implicit method to solve the system of ordinary differential equations:
$$\frac{dx}{dt} = f(x).$$
Let $x_m$ and $x_{m+1}$ be the solutions at $t_m$ and $t_{m+1} = t_m + \Delta t_m$, respectively; then we update the solution from $x_m$ to $x_{m+1}$ by:
$$\frac{x_{m+1} - x_m}{\Delta t_m} = f(x_m) + \frac{\partial f}{\partial x}(x_m)\,(x_{m+1} - x_m),$$
which involves the linear system with $A_m = I - \Delta t_m\,\frac{\partial f}{\partial x}(x_m)$. Thus we can always find a diagonally dominant matrix $A_m$ by choosing a small $\Delta t_m$.

3 Gauss-Seidel Method

The Jacobi method can be written component-by-component as:
$$\text{for } i = 1, \ldots, n: \quad x_{k+1;i} = \frac{-\sum_{j \ne i} a_{ij}x_{k;j} + b_i}{a_{ii}}. \qquad (3.1)$$
Here we denote $x_k = [x_{k;i}]$. One argument about the Jacobi method is that we're not using the most updated information in the solution, i.e., it is always the components of $x_k$ that appear on the right-hand side of the updating formula. A modification that always uses the most recent data and saves some storage is the following:
$$\text{for } i = 1, \ldots, n: \quad x_{k+1;i} = \frac{-\sum_{j < i} a_{ij}x_{k+1;j} - \sum_{j > i} a_{ij}x_{k;j} + b_i}{a_{ii}}. \qquad (3.2)$$
This algorithm is known as the Gauss-Seidel method, and it is equivalent to the choice $M = D + L$:
$$x_{k+1} = -(D + L)^{-1}(Ux_k - b). \qquad (3.3)$$
The Gauss-Seidel method also guarantees convergence for arbitrary initial data for diagonally dominant matrices. Particularly, similar to (2.4) we can derive a decay rate:
$$\rho_{\mathrm{gs}} = \max_i \frac{\sum_{j > i} |a_{ij}|}{|a_{ii}| - \sum_{j < i} |a_{ij}|}. \qquad (3.4)$$
We prove in the exercises $\|(D + L)^{-1}U\|_\infty \le \rho_{\mathrm{gs}}$ and hence deduce that $\|\varepsilon_k\|_\infty \le \rho_{\mathrm{gs}}^k\,\|\varepsilon_0\|_\infty$. Comparing (2.4) and (3.4) we see that for the same diagonally dominant matrix $A$, $\rho_{\mathrm{gs}} \le \rho_{\mathrm{jac}}$; hence the Gauss-Seidel method in general converges faster than the Jacobi method, at the cost of solving a triangular system instead of a diagonal one at each iteration.

Because solving the linear system with $L + D$ involves forward substitution, the method described before is also called the forward Gauss-Seidel method. Similarly, we can choose $M = D + U$ and establish a similar convergence result for diagonally dominant matrices; this method is called the backward Gauss-Seidel method. A code sketch of the forward sweep follows.
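This sketch implements the forward sweep (3.2); the in-place update of `x[i]` is exactly what "using the most recent data" means, and it is also why Gauss-Seidel needs only one solution vector in storage. The interface and the fixed iteration count are illustrative assumptions.

```python
import numpy as np

def gauss_seidel(A, b, x0, num_iters=100):
    """Forward Gauss-Seidel sweep, eq. (3.2)."""
    n = len(b)
    x = x0.astype(float)
    for _ in range(num_iters):
        for i in range(n):
            s_new = A[i, :i] @ x[:i]      # already-updated entries x_{k+1;j}, j < i
            s_old = A[i, i+1:] @ x[i+1:]  # old entries x_{k;j}, j > i
            x[i] = (b[i] - s_new - s_old) / A[i, i]
    return x
```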
Finally, we show the convergence of the forward Gauss-Seidel method for another type of very important matrices, namely the symmetric positive-definite ones. This is a direct result of the following theorem and Theorem 1.1.

Theorem 3.1. Let $A$ be symmetric positive-definite; then $G = -(L + D)^{-1}L^t$ satisfies $\rho(G) < 1$.

Proof. Clearly $D$ has all its diagonal entries positive and it is non-singular; thus we may write:
$$G = -\big[D^{1/2}\big(D^{-1/2}LD^{-1/2} + I\big)D^{1/2}\big]^{-1}L^t = -D^{-1/2}(\tilde{L} + I)^{-1}\tilde{L}^t D^{1/2}, \quad \tilde{L} = D^{-1/2}LD^{-1/2}.$$
Let $\lambda \in \mathbb{C}$ be any eigenvalue of $G$; since:
$$-(\tilde{L} + I)^{-1}\tilde{L}^t = D^{1/2}GD^{-1/2},$$
$\lambda$ is also an eigenvalue of $\tilde{G} = -(\tilde{L} + I)^{-1}\tilde{L}^t$. Choose $z \in \mathbb{C}^n$ as a unit eigenvector of $\tilde{G}$ corresponding to $\lambda$, i.e., $-\tilde{L}^t z = \lambda(\tilde{L} + I)z$. Define $\tilde{A} = \tilde{L} + I + \tilde{L}^t = D^{-1/2}(L + D + L^t)D^{-1/2} = D^{-1/2}AD^{-1/2}$; then $\tilde{A}$ is also symmetric positive-definite. Let us define $\alpha = z^*\tilde{L}z$ and denote $\alpha = a + ib$, $a, b \in \mathbb{R}$; we also denote $z = x + iy$ where $x, y \in \mathbb{R}^n$. Then we have:
$$-\bar{\alpha} = -z^*\tilde{L}^t z = \lambda z^*(\tilde{L} + I)z = \lambda(1 + \alpha),$$
where we used the assumption $z^*z = 1$. Next, we compute $z^*\tilde{A}z$:
$$z^*\tilde{A}z = z^*(\tilde{L} + I + \tilde{L}^t)z = (1 + \alpha) + \bar{\alpha} = 1 + 2a.$$
However, by the positive definiteness of $\tilde{A}$, we have:
$$z^*\tilde{A}z = (x^t - iy^t)\tilde{A}(x + iy) = x^t\tilde{A}x + y^t\tilde{A}y > 0.$$
Thus $1 + 2a > 0$, and by $\lambda = -\bar{\alpha}/(1 + \alpha)$ we have:
$$|\lambda|^2 = \left|\frac{a - ib}{1 + a + ib}\right|^2 = \frac{a^2 + b^2}{1 + 2a + a^2 + b^2} < 1.$$
The proof is completed by noting that $\rho(G) = \rho(\tilde{G})$ and the choice of $\lambda$ is arbitrary. $\square$

4 SOR Method

The successive over-relaxation (SOR) method takes a linear combination of the Jacobi method and the Gauss-Seidel method to provide more control over the convergence rate. Particularly, we choose $M = M_\omega = \frac{1}{\omega}D + L$ for some $\omega > 0$. Letting $\omega \to 0$ the method tends to a (damped) Jacobi-type update, and setting $\omega = 1$ the method corresponds to the forward Gauss-Seidel method. For the SOR with $\omega > 0$, we have:
$$G = G_\omega = -\left(\frac{1}{\omega}D + L\right)^{-1}\left(\frac{\omega - 1}{\omega}D + U\right).$$
It can be shown that if $A$ is symmetric positive-definite and $0 < \omega < 2$, then the SOR method converges. The proof is similar to that of Theorem 3.1, and the details are left as an exercise.
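The following sketch implements an SOR sweep consistent with $M_\omega = \frac{1}{\omega}D + L$: each component first forms the Gauss-Seidel value, then relaxes between it and the current value. Setting `omega=1.0` reproduces the Gauss-Seidel sweep above; the interface is an assumption of this sketch, not prescribed by the lecture.

```python
import numpy as np

def sor(A, b, x0, omega, num_iters=100):
    """SOR sweep for M_omega = D/omega + L; omega = 1 is forward Gauss-Seidel."""
    n = len(b)
    x = x0.astype(float)
    for _ in range(num_iters):
        for i in range(n):
            s_new = A[i, :i] @ x[:i]
            s_old = A[i, i+1:] @ x[i+1:]
            x_gs = (b[i] - s_new - s_old) / A[i, i]     # Gauss-Seidel candidate
            x[i] = (1.0 - omega) * x[i] + omega * x_gs  # relaxation step
    return x
```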
Note that due to a theorem by Kahan [1], $\rho(G_\omega) \ge |\omega - 1|$; hence a necessary condition for the SOR method to converge is also $0 < \omega < 2$. A major advantage of the SOR method is that it allows one to tune $\omega$ in order to minimize $\rho(G_\omega)$ for some special matrices. For example, if $A$ is symmetric positive-definite and also tridiagonal, then $\rho(G_{\mathrm{gs}}) = \rho(G_{\mathrm{jac}})^2 < 1$ and the optimal choice for SOR is:
$$\omega = \frac{2}{1 + \sqrt{1 - \rho(G_{\mathrm{jac}})^2}}.$$
In this case, $\rho(G_\omega) = \omega - 1$, which is optimal by the Kahan theorem.

5 Related Topics: Acceleration and Preconditioners

The acceleration technique tries to improve the convergence of an existing iterative method. Suppose we obtained $x_0, x_1, \ldots, x_k$ from the standard iterative method; then the plan is to compute a linear combination:
$$y_k = \sum_{i=0}^k \nu_i(k)x_i, \qquad (5.1)$$
so that $y_k$ represents a better approximation to the exact solution. Note that a natural condition on the coefficients is $\sum_{i=0}^k \nu_i(k) = 1$, so that if all iterates are exact, so is $y_k$. If we define a polynomial:
$$p_k(x) = \nu_0(k) + \nu_1(k)x + \cdots + \nu_k(k)x^k, \qquad (5.2)$$
and extend its definition to matrices naturally, we have:
$$y_k - x = p_k(G)\varepsilon_0,$$
where $x$ is the exact solution. Hence the target is to minimize $\|p_k(G)\|$ in a certain norm.

The Chebyshev semi-iterative method for symmetric matrices makes use of the fact that the eigenvalues of $p_k(G)$ are $p_k(\lambda)$, where $\lambda$ is any eigenvalue of $G$. Knowing that any eigenvalue of $G$ lies between $-1$ and $1$, the method utilizes the Chebyshev polynomials $c_k(x)$, defined recursively by $c_0(x) = 1$, $c_1(x) = x$, and $c_{k+1}(x) = 2xc_k(x) - c_{k-1}(x)$, and defines:
$$p_k(x) = \frac{1}{c_k(\mu)}\,c_k\!\left(-1 + 2\,\frac{x - \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}}\right), \quad \text{where } \mu = -1 + 2\,\frac{1 - \lambda_{\min}}{\lambda_{\max} - \lambda_{\min}},$$
and $-1 < \lambda_{\min} < \lambda_{\max} < 1$ are the smallest and largest eigenvalues of $G$, respectively. There are two benefits of using the Chebyshev polynomials. First, $c_k(x)$ satisfies $|c_k(x)| \le 1$ on $[-1, 1]$ and it grows rapidly off this interval; thus the following estimate is expected to be small:
$$\|y_k - x\| \le \frac{1}{|c_k(\mu)|}\,\|\varepsilon_0\|.$$
Secondly, the recursive relation of the Chebyshev polynomials enables the following algorithm that completely removes the need to compute the iterates $x_k$ but calculates $y_k$ directly:
$$y_{k+1} = \omega_{k+1}(y_k - y_{k-1} + \gamma z_k) + y_{k-1}, \quad Mz_k = b - Ay_k,$$
where
$$\gamma = \frac{2}{2 - \lambda_{\min} - \lambda_{\max}}, \quad \omega_{k+1} = \frac{2 - \lambda_{\min} - \lambda_{\max}}{\lambda_{\max} - \lambda_{\min}} \cdot \frac{2c_k(\mu)}{c_{k+1}(\mu)} = \frac{2\mu\, c_k(\mu)}{c_{k+1}(\mu)}.$$
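The three-term recurrence above translates directly into code. The sketch below follows the update $y_{k+1} = \omega_{k+1}(y_k - y_{k-1} + \gamma z_k) + y_{k-1}$ with the coefficients just defined; the first step $y_1 = y_0 + \gamma z_0$ and the dense $M$-solves are assumptions made for the sake of a self-contained example.

```python
import numpy as np

def chebyshev_accel(A, b, M, y0, lam_min, lam_max, num_iters=50):
    """Chebyshev semi-iterative acceleration of the splitting with matrix M.

    lam_min, lam_max: bounds on the eigenvalues of G = -M^{-1}(A - M),
    assumed real with -1 < lam_min < lam_max < 1 (symmetric case).
    """
    mu = -1.0 + 2.0 * (1.0 - lam_min) / (lam_max - lam_min)
    gamma = 2.0 / (2.0 - lam_min - lam_max)
    c_prev, c_curr = 1.0, mu                 # c_0(mu), c_1(mu)
    y_prev = y0.copy()
    y = y_prev + gamma * np.linalg.solve(M, b - A @ y_prev)  # first step y_1
    for _ in range(1, num_iters):
        c_next = 2.0 * mu * c_curr - c_prev  # Chebyshev recursion at mu
        omega = 2.0 * mu * c_curr / c_next   # omega_{k+1}
        z = np.linalg.solve(M, b - A @ y)    # M z_k = b - A y_k
        y, y_prev = omega * (y - y_prev + gamma * z) + y_prev, y
        c_prev, c_curr = c_curr, c_next
    return y
```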
Another use of the iterative methods is to construct preconditioners. A (left) preconditioner $P$ modifies the original equation $Ax = b$ to:
$$P^{-1}Ax = P^{-1}b. \qquad (5.3)$$
In general, the preconditioner depends highly on the problems to be solved, such as the low-Mach preconditioner for low-speed aerodynamic problems. From a pure linear algebra point of view, though, the iterative methods provide a class of preconditioners given by $P = M$. In this case, we still need to solve a non-trivial system with $M^{-1}A$, but the hope is that $M^{-1}A$ will be better conditioned than $A$ itself. Corresponding to the previous methods, we have the following preconditioners:
$$P_{\mathrm{jac}} = D, \quad P_{\mathrm{gs}} = L + D, \quad P_{\mathrm{sor}} = L + \frac{1}{\omega}D.$$

Exercises

Exercise 1. Use mathematical induction to show (1.6) in the case $n_l = 3$.

Exercise 2. Let $A$ be diagonally dominant; we want to complete the proof that the Gauss-Seidel method converges. Particularly, let $G = -(L + D)^{-1}U$; show that $\|G\|_\infty \le \rho_{\mathrm{gs}}$, which is given by (3.4). Hint: Using the definition of induced matrix norms, we just need to show that for all $\|x\|_\infty = 1$, there is $\|y\|_\infty \le \rho_{\mathrm{gs}}$, where $y = Gx$. And for this purpose, use $Dy = -Ly - Ux$.

Exercise 3. Show that if $A$ is symmetric, diagonally dominant, and all its diagonal elements are positive, then $A$ is positive definite. Hint: Show that $x^tAx \ge 0$ and derive the condition for the equality to hold. For this purpose, use the inequality $|a_{ij}x_ix_j| \le \frac{1}{2}\left(|a_{ij}|x_i^2 + |a_{ji}|x_j^2\right)$.

Exercise 4. Prove that if $A$ is symmetric positive-definite, then the SOR method with $0 < \omega < 2$ converges.

Exercise 5. Let us consider the SOR method with $\omega > 0$; show that:
$$\det G_\omega = (1 - \omega)^n.$$
Then deduce that $\rho(G_\omega) \ge |1 - \omega|$, the Kahan theorem. Hint: The determinant of a matrix $A$ is the product of all the eigenvalues of $A$.

References

[1] William Morton Kahan. Gauss-Seidel methods of solving large systems of linear equations. PhD thesis, University of Toronto, 1958.