Chapter 2. Solving Systems of Equations. 2.1 Gaussian elimination

Chapter 2

Solving Systems of Equations

A large number of real-life applications which are resolved through mathematical modeling end up taking the form of the following very simple looking matrix system

    Ax = b                                                        (2.1)

Here A represents a known m x n matrix and b a known vector with m entries. The vector x represents the n unknowns. Since a large variety of problems can be transformed into this general formulation, a number of methods have been developed which can produce exact or approximate solutions for this problem.

For systems where m = n the obvious solution, which is also the simplest, is to find the inverse of the matrix A in order to write the solution as x = A^(-1) b. One important aspect of any numerical computation, to which we pay particular attention, is the computational cost. Computational cost refers to the number of additions, subtractions, multiplications and divisions that must be performed in the computer in order to obtain the desired result. When the size of the matrix is sufficiently large, simply finding the inverse of A (if it exists) is not the most effective way to solve this problem: computing the inverse has a very high computational cost!

Alternatively, you might recall from your early algebra classes elimination and pivoting methods such as Gaussian elimination with backward substitution, or the closely related Gauss-Jordan method. These methods require O(n^3) operations for matrices of size n x n, so as the size of the matrix increases the operation count skyrockets. Instead, alternative, more effective techniques are used in practice. One common procedure is to produce a factorized version of A. That idea can reduce the operational cost of solving for x from order n^3 to order n^2. This translates to almost a 99% reduction in calculations, assuming the matrices are larger than 100 x 100 (not unusual for applications nowadays). Unfortunately the operational cost of producing the factors of A in the first place is still of order n^3. So overall we have not really gained much... Well, that is not quite true. There is a benefit; to see it, read further on regarding these methods below.

2.1 Gaussian elimination

We begin by providing an outline for performing Gaussian elimination, which you learned in your introductory linear algebra classes. We will subsequently improve this basic algorithm into the more efficient methods hinted at above, which we explain in more detail in the sections below.

One key aspect of the method which we need to emphasize is numerical stability. We could easily provide a method which performs naive Gaussian elimination and regrettably obtain completely wrong solutions! Numerical stability, or the lack of it, depends on how you perform the required operations in order to preserve as much numerical accuracy as possible. To avoid such numerical issues we must make sure that the largest available numbers in a given column are used as denominators in the divisions which must be performed. To achieve this we perform row interchanges in order to place the largest such element of each column in the pivot position of the augmented matrix.

Keep in mind two important points about Gaussian elimination: a) if the matrix A is singular it is not possible to carry the method through to a solution, and b) Gaussian elimination can be applied to any m x n matrix. As a result it is a general method and not limited to just square matrices. Note also that we perform Gaussian elimination on the augmented matrix, a new m x (n+1) matrix which contains all of matrix A with the vector b attached as a last column.

Pseudo-code for Gaussian elimination into row-echelon form

1. Main loop in k = 1, 2, ..., m.
2. Find the largest element in absolute value in column k (on or below row k) and call it max(k).
3. If max(k) = 0 then stop: the matrix is singular.
4. Swap rows in order to place the row holding the largest element of column k in row k. This ensures numerical stability.
5. Do the following for all rows below the pivot: loop in i = k+1, k+2, ..., m.
6. Do the following for all elements in the current row: for j = k, k+1, ..., n.
7.     A(i, j) = A(i, j) - A(k, j) * ( A(i, k) / A(k, k) )
8. Fill the remaining lower triangular part of the matrix with zeros: A(i, k) = 0.

As already discussed in the introduction we are particularly interested in methods which are efficient. In that respect the number of operations performed during the computation is of great interest, and we must count the number of additions, subtractions, multiplications and divisions required in order to completely solve the problem above using Gaussian elimination. The total number of divisions above is n(n-1)/2. The number of multiplications is n(n-1)(2n+2)/6. Finally, the number of additions/subtractions is n(n-1)(2n-1)/6. The overall cost of Gaussian elimination therefore is O(2n^3/3). The big-O notation is used to indicate that the largest term in the total number of operations for this method grows like 2n^3/3.
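The steps above translate almost directly into code. The following is a minimal sketch in Python/NumPy (the function name and the error handling are my own, not part of the original pseudo-code); it reduces the augmented matrix [A | b] of a square system to row-echelon form using partial pivoting.

    import numpy as np

    def forward_elimination(A, b):
        # Reduce the augmented matrix [A | b] of a square system to row-echelon
        # form with partial pivoting, following steps 1-8 above.
        M = np.hstack([np.asarray(A, dtype=float),
                       np.asarray(b, dtype=float).reshape(-1, 1)])
        m = M.shape[0]
        for k in range(m):
            # Steps 2-4: locate the largest pivot candidate in column k and swap it up.
            p = k + np.argmax(np.abs(M[k:, k]))
            if M[p, k] == 0.0:
                raise ValueError("matrix is singular: no nonzero pivot in column")
            M[[k, p]] = M[[p, k]]
            # Steps 5-8: eliminate the entries below the pivot.
            for i in range(k + 1, m):
                factor = M[i, k] / M[k, k]
                M[i, k:] -= factor * M[k, k:]
                M[i, k] = 0.0
        return M

Back-substitution, described next, is then applied to the reduced augmented matrix to recover x.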

Now we must also undertake the task of back-substitution in order to find the actual solution x of the system. This however is a relatively easy computational task, and we provide a short pseudo-code for it as well. We assume for now that we have a system of the form Ux = b where U is an upper triangular n x n matrix.

Pseudo-code for back-substitution

1. Main loop for i = n, n-1, ..., 1.
2. If U(i, i) = 0 then stop: the matrix is singular.
3. Construct b(i) = b(i) - Σ_{j=i+1}^{n} U(i, j) x(j).
4. Solve x(i) = b(i) / U(i, i).

The number of operations for back-substitution is as follows: n divisions, (n-1)n/2 multiplications, and n(n-1)/2 additions and subtractions. So clearly the cost of back-substitution is of order O(n^2). Therefore the overall cost of solving the system Ax = b is still of the order of O(n^3). Again, we believe that we can improve slightly on the efficiency of our methodology by considering a factorization of the matrix A instead. We do this next.

2.2 LU factorization - Doolittle's version

In the next method which we examine we factor the matrix A into two other matrices: a lower triangular matrix L and an upper triangular matrix U such that

    A = LU.

The overall idea for solving the system Ax = b is as follows: we start by replacing the matrix A with its factors LU, so we can write the system as

    LUx = b.

We now give the product Ux a new name, y. Thus LUx = b becomes

    Ly = b    where    y = Ux.

Since L is a lower triangular matrix, the system Ly = b is almost trivial to solve for the unknowns y. Once we have found all the values of y we can then solve the system

    Ux = y.

Note that this system is also very easy to solve since U is an upper triangular matrix. Thus finding x with this method is also very easy. The only thing left to do is to actually compute the lower triangular matrix L and the upper triangular matrix U for which A = LU. This is accomplished by the usual Gaussian elimination method, applied only up to the point of obtaining an upper triangular matrix (without the back-substitution).
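Both triangular solves are short loops. Here is a minimal sketch of the two substitution sweeps (the function name lu_solve and its interface are assumptions of mine), given factors L and U that have already been computed:

    import numpy as np

    def lu_solve(L, U, b):
        # Solve Ax = b given A = LU: first Ly = b (forward substitution, top down),
        # then Ux = y (back substitution, bottom up).
        L, U, b = np.asarray(L, float), np.asarray(U, float), np.asarray(b, float)
        n = len(b)
        y = np.zeros(n)
        for i in range(n):                      # forward substitution
            y[i] = (b[i] - L[i, :i] @ y[:i]) / L[i, i]
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):          # back substitution
            if U[i, i] == 0.0:
                raise ValueError("U is singular")
            x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
        return x

Each sweep costs O(n^2) operations, matching the counts given above.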

Let us look at a simple example.

Example. Solve the following matrix system using an LU factorization:

    [ 1  2  3 ] [ x_1 ]   [  0 ]
    [ 3  4  1 ] [ x_2 ] = [ -6 ]
    [ 1  0  1 ] [ x_3 ]   [ -1 ]

Solution. The main part will be to produce the LU factorization. Once this is done, solving the system will be easy. To produce the factorization we start with the usual Gaussian elimination method. For ease of notation we denote by R_i the rows of A. Then, to create zeros below the element a(1,1), we simply perform -3R_1 + R_2 -> R_2 and -R_1 + R_3 -> R_3:

    [ 1   2   3 ]
    [ 0  -2  -8 ]
    [ 0  -2  -2 ]

Last, we create a zero below a(2,2) via -R_2 + R_3 -> R_3:

    [ 1   2   3 ]
    [ 0  -2  -8 ]
    [ 0   0   6 ]

This procedure has, remarkably, already produced our required matrices L and U from A. In fact the factors are

    A = LU = [ 1  0  0 ] [ 1   2   3 ]
             [ 3  1  0 ] [ 0  -2  -8 ]
             [ 1  1  1 ] [ 0   0   6 ]

Do the multiplication to check the result! How did we obtain the matrix L? Note that L simply records, with opposite sign, the coefficients by which we multiplied the pivot rows during the Gaussian elimination in order to create U. The diagonal elements of L are always 1 for the Doolittle method, so we do not need to compute them.

Let us now revisit the original system Ax = b. Given L and U we can easily solve the original system as follows. First solve Ly = b,

    [ 1  0  0 ] [ y_1 ]   [  0 ]
    [ 3  1  0 ] [ y_2 ] = [ -6 ]
    [ 1  1  1 ] [ y_3 ]   [ -1 ]

Top down you can almost read off the solution: y_1 = 0, y_2 = -6 and y_3 = 5. Now you can solve the second part, Ux = y, for x:

    [ 1   2   3 ] [ x_1 ]   [  0 ]
    [ 0  -2  -8 ] [ x_2 ] = [ -6 ]
    [ 0   0   6 ] [ x_3 ]   [  5 ]

This time the solution is read from the bottom up as x_1 = -11/6, x_2 = -1/3 and x_3 = 5/6.
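As a quick numerical check of the example (a sketch of mine, not part of the original notes), NumPy confirms both the factorization and the solution:

    import numpy as np

    A = np.array([[1.0, 2.0, 3.0], [3.0, 4.0, 1.0], [1.0, 0.0, 1.0]])
    b = np.array([0.0, -6.0, -1.0])
    L = np.array([[1.0, 0.0, 0.0], [3.0, 1.0, 0.0], [1.0, 1.0, 1.0]])
    U = np.array([[1.0, 2.0, 3.0], [0.0, -2.0, -8.0], [0.0, 0.0, 6.0]])

    print(np.allclose(L @ U, A))       # True: the factorization is correct
    print(np.linalg.solve(A, b))       # [-1.8333, -0.3333, 0.8333] = [-11/6, -1/3, 5/6]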

The following pseudo-code outlines this procedure.

Pseudo-code for LU (Doolittle)

1. Input the matrix A and the diagonal elements of L (i.e. ones).
2. Let u(1,1) = a(1,1)/l(1,1). If l(1,1)u(1,1) = 0 then the LU factorization is not possible; STOP.
3. For j = 2, ..., n let u(1,j) = a(1,j)/l(1,1) and l(j,1) = a(j,1)/u(1,1).
4. For i = 2, 3, ..., n-1 do
       Let u(i,i) = ( a(i,i) - Σ_{k=1}^{i-1} l(i,k)u(k,i) ) / l(i,i).
       If l(i,i)u(i,i) = 0 then STOP: the factorization is not possible.
       For j = i+1, ..., n
           Let u(i,j) = ( a(i,j) - Σ_{k=1}^{i-1} l(i,k)u(k,j) ) / l(i,i)
           Let l(j,i) = ( a(j,i) - Σ_{k=1}^{i-1} l(j,k)u(k,i) ) / u(i,i)
5. Let u(n,n) = a(n,n) - Σ_{k=1}^{n-1} l(n,k)u(k,n). If l(n,n)u(n,n) = 0 then the factorization A = LU exists but A is a singular matrix.
6. Print out all the elements of L and U.

Once you have the factorization you can solve the matrix system with the following very simple substitution scheme.

Pseudo-code for the solution of AX = B

1. First solve LY = B:
2. For i = 1, 2, ..., n do
       y(i) = ( b(i) - Σ_{j=1}^{i-1} l(i,j)y(j) ) / l(i,i)
3. Now solve UX = Y by back-substitution in exactly the same way:
4. For i = n, n-1, ..., 1 do
       x(i) = ( y(i) - Σ_{j=i+1}^{n} u(i,j)x(j) ) / u(i,i)
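A minimal sketch of the Doolittle factorization itself, following the pseudo-code above (the function name doolittle_lu is my own; as in the pseudo-code, no pivoting is performed):

    import numpy as np

    def doolittle_lu(A):
        # Doolittle factorization: L is unit lower triangular, U is upper
        # triangular and A = LU.  No row interchanges are performed.
        A = np.asarray(A, dtype=float)
        n = A.shape[0]
        L = np.eye(n)
        U = np.zeros((n, n))
        for i in range(n):
            for j in range(i, n):                         # row i of U
                U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
            if U[i, i] == 0.0 and i < n - 1:
                raise ValueError("factorization without row interchanges is not possible")
            for j in range(i + 1, n):                     # column i of L
                L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
        return L, U

Applied to the example matrix A above, doolittle_lu reproduces the factors L and U found there.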

There are a couple of results which are interesting since they give us general criteria under which these methods are applicable. The following definition is needed first.

Definition 2.2.1. The n x n matrix A is said to be strictly diagonally dominant if

    |a(i,i)| > Σ_{j != i} |a(i,j)|    for all i = 1, 2, ..., n,

i.e. each diagonal entry dominates the sum of the absolute values of the remaining entries in its row.

The first result comes indirectly through Gaussian elimination.

Theorem 2.2.2. A strictly diagonally dominant matrix A is non-singular. Furthermore, Gaussian elimination can be performed on any linear system of the form Ax = b to obtain its unique solution without row or column interchanges, and the computations are stable with respect to the growth of round-off errors.

When can we perform an LU decomposition? The following theorem gives the answer.

Theorem 2.2.3. If Gaussian elimination can be performed on the linear system Ax = b without row interchanges, then the matrix A can be factored into the product of a lower-triangular matrix L and an upper-triangular matrix U, with A = LU.

There is another type of factorization which is in fact very similar to this LU or Doolittle decomposition. The alternative also produces an LU decomposition, but with U being a unit upper triangular matrix instead of L. This is called Crout's factorization. Naturally either factorization will do the job, and producing one or the other is more a matter of taste than anything else. You can change the provided pseudo-code very easily in order to produce such a factorization.

LDL^T and LL^T (Cholesky) factorizations

We continue here by presenting more methods for factoring A. All the techniques presented, similarly to the LU decomposition of A, have the same overall operational cost of O(n^3). As the name indicates, an LDL^T type factorization takes the following form,

    A = L D L^T,

where L as usual is lower triangular and D is a diagonal matrix with positive entries on the diagonal. Similarly, the Cholesky factorization A = LL^T consists of a lower and an upper triangular factor, neither of which has 1s on the diagonal (in contrast to either Doolittle's or Crout's factorizations). It is very easy to construct any of the above factorizations once you have an LU decomposition of A. Let us look at the equivalent factorizations of the following matrix:

    A = [ 60  30  20 ]
        [ 30  20  15 ]
        [ 20  15  12 ]

Using our pseudo-code we obtain the following LU decomposition of A,

    A = LU = [  1    0   0 ] [ 60  30   20 ]
             [ 1/2   1   0 ] [  0   5    5 ]
             [ 1/3   1   1 ] [  0   0  1/3 ]

Now the equivalent LDL^T decomposition consists of the following three matrices,

    A = L D L^T = [  1    0   0 ] [ 60  0    0 ] [ 1  1/2  1/3 ]
                  [ 1/2   1   0 ] [  0  5    0 ] [ 0   1    1  ]
                  [ 1/3   1   1 ] [  0  0  1/3 ] [ 0   0    1  ]

Note how the new upper triangular matrix has been obtained by simply dividing each row of the old upper triangular matrix by its respective diagonal element.
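A small sketch (variable names are mine) showing how the LDL^T factors of the symmetric example above fall out of its LU decomposition: take D = diag(U) and divide each row of U by its diagonal element, which produces exactly L^T.

    import numpy as np

    L = np.array([[1.0, 0.0, 0.0], [0.5, 1.0, 0.0], [1.0/3.0, 1.0, 1.0]])
    U = np.array([[60.0, 30.0, 20.0], [0.0, 5.0, 5.0], [0.0, 0.0, 1.0/3.0]])
    A = np.array([[60.0, 30.0, 20.0], [30.0, 20.0, 15.0], [20.0, 15.0, 12.0]])

    D = np.diag(np.diag(U))                  # D = diag(60, 5, 1/3)
    Lt = np.diag(1.0 / np.diag(U)) @ U       # each row of U divided by its diagonal entry

    print(np.allclose(Lt, L.T))              # True: the rescaled U is exactly L^T
    print(np.allclose(L @ D @ Lt, A))        # True: A = L D L^T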

Now that we have the LDL^T factorization we can also easily obtain the equivalent Crout factorization,

    A = [ 60   0    0  ] [ 1  1/2  1/3 ]
        [ 30   5    0  ] [ 0   1    1  ]
        [ 20   5   1/3 ] [ 0   0    1  ]

Note here that the new lower triangular matrix is constructed by simply multiplying out the matrices L and D. Last, the Cholesky decomposition is also easily constructed from the LDL^T form above by splitting the diagonal matrix D into two factors, D = sqrt(D) sqrt(D), and multiplying out L sqrt(D) to produce a lower triangular matrix and sqrt(D) L^T to produce an upper triangular matrix:

    A = L sqrt(D) sqrt(D) L^T

      = [  1    0   0 ] [ sqrt(60)    0        0      ] [ sqrt(60)    0        0      ] [ 1  1/2  1/3 ]
        [ 1/2   1   0 ] [    0     sqrt(5)     0      ] [    0     sqrt(5)     0      ] [ 0   1    1  ]
        [ 1/3   1   1 ] [    0        0    sqrt(3)/3  ] [    0        0    sqrt(3)/3  ] [ 0   0    1  ]

      = [  sqrt(60)      0          0       ] [ sqrt(60)  sqrt(60)/2  sqrt(60)/3 ]
        [ sqrt(60)/2  sqrt(5)       0       ] [    0       sqrt(5)     sqrt(5)   ]
        [ sqrt(60)/3  sqrt(5)   sqrt(3)/3   ] [    0          0       sqrt(3)/3  ]

This is the LL^T form of the matrix A.

Let us now look at results regarding when we can perform most of these factorizations. We will first need the following definition.

Definition 2.2.4. A matrix A is positive definite if it is symmetric and if x^T A x > 0 for every x != 0.

Based on this definition the following theorem holds.

Theorem 2.2.5. If A is an n x n positive definite matrix then

    A is nonsingular,
    a(i,i) > 0 for all i = 1, 2, ..., n,
    max_{k != j} |a(k,j)| < max_{1<=i<=n} |a(i,i)|,
    a(i,j)^2 < a(i,i) a(j,j) for each i != j.

Recall that one of the conditions for a matrix to be nonsingular is that det A != 0. Further,

Theorem 2.2.6. A symmetric matrix A is positive definite if and only if it can be factored as A = LDL^T, with L unit lower triangular and D diagonal with positive diagonal entries; equivalently, if and only if A = LL^T for a lower triangular matrix L with positive diagonal entries.
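As a sanity check (a sketch of mine, not part of the notes): since the example matrix A above is positive definite, NumPy's built-in routine produces its Cholesky factor, and it matches the LL^T factor computed by hand; numpy.linalg.cholesky returns the lower triangular factor.

    import numpy as np

    A = np.array([[60.0, 30.0, 20.0], [30.0, 20.0, 15.0], [20.0, 15.0, 12.0]])
    C = np.linalg.cholesky(A)                # lower triangular Cholesky factor

    print(C)                                 # approx [[7.746, 0, 0], [3.873, 2.236, 0], [2.582, 2.236, 0.577]]
    print(np.allclose(C @ C.T, A))           # True: A = L L^T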

2.3 Iterative methods for Ax = b

As the name indicates, we will now attempt to solve the system Ax = b with an iterative scheme instead of a direct method. The difference is that the solution produced by any of the direct methods presented in the previous sections is exact (up to round-off) and is obtained after a fixed amount of work. In contrast, as is usually the case with iterative schemes, their solutions are obtained after a number of iterations and are not exact but only approximations within a given tolerance of the true solution. As we will see in the following section, iterative techniques are quite useful when the number of equations to be solved is large (i.e. the size of the matrix is large). Furthermore, such methods tend to be stable with respect to matrices A with large condition number; as a result small initial errors do not pile up during the iterative process and blow up in the end.

2.4 Jacobi, Richardson and Gauss-Seidel methods

We start by discovering the Jacobi and Gauss-Seidel iterative methods through a simple example in two dimensions. The general treatment for either method will be presented after the example. The most basic iterative scheme is considered to be the Jacobi iteration. It is based on a very simple idea: solve each row of your system for the diagonal entry. Thus, if for instance we wish to solve the following system,

    [ 4  3 ] [ x_1 ]   [ -5 ]
    [ 2  5 ] [ x_2 ] = [  6 ]

we first solve each row for its diagonal unknown and obtain

    x_1 = -(3/4) x_2 - 5/4
    x_2 = -(2/5) x_1 + 6/5                                        (2.2)

Thus the Jacobi iterative scheme starts with some guess for x_1 and x_2 on the right hand side of these equations and, we hope, produces after several iterations improved estimates which approach the true solution x. In matrix form the iteration can be written as

    [ x_1^m ]   [   0   -3/4 ] [ x_1^(m-1) ]   [ -5/4 ]
    [ x_2^m ] = [ -2/5    0  ] [ x_2^(m-1) ] + [  6/5 ]           (2.3)

where you can clearly see how the iteration progresses: your previous estimate x^(m-1) of the solution goes in on the right hand side and you obtain a new (hopefully better) estimate x^m on the left hand side. Let us examine the output of the Jacobi scheme for a few iterations based on this numerical example. We assume that the initial guess is taken to be, without loss of generality, x^0 = [0, 0]^T.

    n        0     5           10          15          20          Exact sol.
    x_1^n    0    -2.907500   -3.063965   -3.071030   -3.071410   -3.0714285714
    x_2^n    0     2.318000    2.422670    2.428302    2.428557    2.4285714285
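The table above can be reproduced with a few lines of code (a sketch of mine; the iteration matrix G and vector C are those of equation (2.3)):

    import numpy as np

    G = np.array([[0.0, -3.0 / 4.0], [-2.0 / 5.0, 0.0]])    # iteration matrix of (2.3)
    C = np.array([-5.0 / 4.0, 6.0 / 5.0])

    x = np.array([0.0, 0.0])                 # initial guess x^0
    for m in range(1, 21):
        x = G @ x + C
        if m % 5 == 0:
            print(m, x)                      # m = 5 gives approximately (-2.9075, 2.318)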

A very simple but effective improvement on the Jacobi scheme has been suggested: the Gauss-Seidel method, which simply uses the new value of x_1 in the second row of (2.2). Thus the Gauss-Seidel method for the example above is

    x_1^m = -(3/4) x_2^(m-1) - 5/4
    x_2^m = -(2/5) x_1^m     + 6/5                                (2.4)

Let us similarly compare a small number of iterates using the Gauss-Seidel method in the following table.

    n        0     5           10          15          20          Exact sol.
    x_1^n    0    -3.056675   -3.071393   -3.071428   -3.071428   -3.0714285714
    x_2^n    0     2.422670    2.428557    2.428571    2.428571    2.4285714285

You may be wondering, rightfully so, whether it really is that simple... in other words, whether the method, as outlined above, works all the time. The answer is NO! The reason that things worked out so nicely in the example presented above is that the matrix A is diagonally dominant. In fact we provide the relevant theorems describing when things are expected to work out for either the Jacobi or the Gauss-Seidel method below (see Theorems 2.4.3 and 2.4.4).

Generalization of iterative methods

We will now generalize our findings and produce a general framework within which to study iterative schemes. In order to do this we make use of an auxiliary matrix Q, to be specified later. The idea relies on what we learned earlier about fixed point problems in one dimension. Let us start by outlining our general set-up. We start as usual from the main system of equations in matrix form

    Ax = b                                                        (2.5)

where, as we have seen before, A and b are known while x denotes the vector of unknowns. First we subtract Ax from both sides, 0 = -Ax + b, and then we add the auxiliary vector Qx to both sides of the result to produce the following system:

    Qx = (Q - A)x + b                                             (2.6)

This new system will be used in order to define our iteration as follows:

    Q x^m = (Q - A) x^(m-1) + b.

First we observe that the solution of (2.6) is simply found from x = Q^(-1)(Q - A)x + Q^(-1)b = (I - Q^(-1)A)x + Q^(-1)b. The iterative scheme corresponding to this set-up is clearly recognizable as a fixed point problem in n dimensions:

    x^m = G x^(m-1) + C                                           (2.7)

where G = I - Q^(-1)A and C = Q^(-1)b. The iterative process can now be initiated with a given initial vector x^0 for the solution. This is usually only a guess, but if any information is known about the solution it should be used to obtain a better initial guess x^0.
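A minimal sketch of the general scheme (2.7) (the function name stationary_iteration and its interface are assumptions of mine): given a splitting matrix Q it forms G = I - Q^(-1)A and C = Q^(-1)b and iterates.

    import numpy as np

    def stationary_iteration(A, b, Q, x0, num_iters):
        # Form G = I - Q^{-1} A and C = Q^{-1} b as in (2.7) and iterate num_iters times.
        A, b = np.asarray(A, float), np.asarray(b, float)
        G = np.eye(len(b)) - np.linalg.solve(Q, A)
        C = np.linalg.solve(Q, b)
        x = np.asarray(x0, dtype=float)
        for _ in range(num_iters):
            x = G @ x + C
        return x

    A = np.array([[4.0, 3.0], [2.0, 5.0]])
    b = np.array([-5.0, 6.0])
    # Q = diag(A) reproduces the Jacobi iterates of the previous example.
    print(stationary_iteration(A, b, np.diag(np.diag(A)), [0.0, 0.0], 20))

Choosing Q = diag(A), as in the call above, corresponds to the Jacobi method; other choices of Q are listed next.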

Given this set-up, the only (and most important) thing left to do is to choose a matrix Q so that the iterative process outlined in (2.7) will

    converge to the true solution x, and
    produce the solution in a small number of iterations.

We have already seen a couple of iterative methods which fulfilled these tasks in one way or another. Let us look at a more general approach for constructing such iterative schemes. Suppose that we can write A as

    A = D - L - U,

where as usual D is a diagonal matrix and L, U are lower and upper triangular matrices respectively (with zeros on their diagonals). Then the matrices for each iterative method are given below.

    Jacobi:        if Q = D     in (2.7), then G = D^(-1)(L + U)    and C = D^(-1) b
    Richardson:    if Q = I     in (2.7), then G = I - A            and C = b
    Gauss-Seidel:  if Q = D - L in (2.7), then G = (D - L)^(-1) U   and C = (D - L)^(-1) b
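The following sketch (assumptions of mine; the matrix names follow the splitting A = D - L - U used above) builds G and C for the Jacobi and Gauss-Seidel choices of Q for the 2 x 2 example and computes the spectral radius of each iteration matrix. Both turn out to be smaller than 1, which, as we see next, is exactly the condition for convergence.

    import numpy as np

    A = np.array([[4.0, 3.0], [2.0, 5.0]])
    b = np.array([-5.0, 6.0])

    D = np.diag(np.diag(A))      # diagonal part of A
    L = -np.tril(A, k=-1)        # strictly lower part, signed so that A = D - L - U
    U = -np.triu(A, k=1)         # strictly upper part

    for name, Q in [("Jacobi", D), ("Gauss-Seidel", D - L)]:
        G = np.eye(len(b)) - np.linalg.solve(Q, A)     # G = I - Q^{-1} A
        C = np.linalg.solve(Q, b)                      # C = Q^{-1} b
        rho = max(abs(np.linalg.eigvals(G)))           # spectral radius of G
        print(name, rho)                               # about 0.548 and 0.3: both < 1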

It is important to know when we can expect the iteration (2.7) to produce a solution. Is it possible for this iterative scheme always to converge? The answer is naturally no! We develop below a result which indicates whether we should expect our iteration to be successful or not. We first define what we mean by a convergent matrix.

Definition 2.4.1. An n x n matrix A is said to be convergent if

    lim_{m -> infinity} A^m(i, j) = 0    for i, j = 1, 2, ..., n.

Further, the following holds.

Theorem 2.4.2. The following are equivalent:

    A is a convergent matrix;
    lim_{m -> infinity} ||A^m|| = 0;
    lim_{m -> infinity} A^m x = 0 for every x;
    rho(A) < 1,

where rho(A) denotes the spectral radius of the matrix A, which is essentially the largest eigenvalue of A in absolute value.

Then the following theorem gives a very useful result.

Theorem 2.4.3. The iterative scheme x^m = G x^(m-1) + C converges to the unique solution of x = Gx + C for any initial guess x^0 if and only if rho(G) < 1.

Proof: Subtracting x = Gx + C from x^m = G x^(m-1) + C we obtain

    x^m - x = G(x^(m-1) - x).

Simply taking norms on both sides of the above we have

    ||x^m - x|| = ||G(x^(m-1) - x)|| <= ||G|| ||x^(m-1) - x||.

Applying this inequality repeatedly for m-1, m-2, ... we obtain

    ||x^m - x|| <= ||G|| ||x^(m-1) - x|| <= ||G||^2 ||x^(m-2) - x|| <= ... <= ||G||^m ||x^0 - x||.

Thus, if we assume that rho(G) < 1 then, based on Theorem 2.4.2, we have lim_{m -> infinity} ||G||^m = 0 in a suitable norm, and therefore ||x^m - x|| -> 0. Thus convergence. We leave the opposite direction of this proof to the reader, since it follows from this outline.

However, this proof is in fact instructive in terms of answering other interesting questions, such as: how many iterations of the Jacobi method are necessary in order for the solution to be found within a given tolerance? Let us look at such an example.

Example: Find the number of iterations needed so that the Jacobi method, starting from the vector x^0 = [0, 0]^T, will reach the solution with a relative error tolerance of 10^(-4) for the following matrix A,

    A = [ 4  3 ]
        [ 2  5 ]

Solution: We need to approach the solution x using the Jacobi iteration,

    x^m = G x^(m-1) + C.                                          (2.8)

Suppose then that the true solution is x. Then (2.8) has to be satisfied by the solution x as follows,

    x = Gx + C.

Subtracting these two equations and taking norms we obtain

    ||x^m - x|| <= ||G|| ||x^(m-1) - x||.

Repeating this m-1 more times we obtain

    ||x^m - x|| <= ||G||^m ||x^0 - x||.

Note however that for this problem we have chosen x^0 = [0, 0]^T. Thus the above becomes

    ||x^m - x|| <= ||G||^m ||x||

or

    ||x^m - x|| / ||x|| <= ||G||^m.

We know from our matrix algebra review in the Appendix that we may employ the spectral radius in order to calculate the Euclidean norm (you may try other norms if you like instead) of G as follows: ||G||_2 = sqrt(rho(G^T G)). Thus, using the Euclidean norm everywhere, we obtain that the relative error satisfies

    ||x^m - x||_2 / ||x||_2 <= rho(G^T G)^(m/2).

Note that the left hand side is nothing more than the relative error. Therefore, in order to find out how many iterations are necessary to approach the solution within a relative error tolerance of 10^(-4), we must solve the following equation for m:

    ( rho(G^T G) )^(m/2) = 10^(-4).

Since rho(G^T G) = 0.5625 we have ||G||_2 = 0.75 and, most importantly,

    m = ln 10^(-4) / ln ||G||_2 = -4 ln 10 / ln(3/4) = 32.0157...

Thus if we choose m = 33 we should be within 10^(-4) of the true solution for this system.
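The estimate can be checked directly (a sketch of mine, using the Jacobi iteration matrix G of the earlier example):

    import numpy as np

    G = np.array([[0.0, -0.75], [-0.4, 0.0]])                  # Jacobi iteration matrix
    norm_G = np.sqrt(np.max(np.linalg.eigvals(G.T @ G).real))  # ||G||_2 = sqrt(rho(G^T G)) = 0.75
    m = np.log(1e-4) / np.log(norm_G)                          # about 32.016
    print(norm_G, m, int(np.ceil(m)))                          # 0.75, 32.0157..., 33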

Let us now look at some theoretical results for each of the methods presented so far.

Theorem 2.4.4. If A is strictly diagonally dominant then the sequence produced by either the Jacobi or the Gauss-Seidel iteration converges to the solution of Ax = b for any starting guess x^0.

We outline the proof here only for the Jacobi iteration, since the Gauss-Seidel case is similar.

Proof: Note that the Jacobi iteration matrix G can be written as G = D^(-1)(L + U). In that case, taking the matrix infinity norm we get

    ||G||_infinity = ||D^(-1)(L + U)||_infinity = max_{1<=i<=n} ( Σ_{j != i} |A(i,j)| / |A(i,i)| ) < 1,

where the last inequality holds simply by the definition of A being strictly diagonally dominant.

2.5 Comparisons

In terms of speed you should always keep in mind that the relative performance of iterative, direct or other methods depends on the problem at hand. For instance, each iteration of either the Gauss-Seidel or the Jacobi method requires about n^2 operations. However, if you are solving a small system of equations then Gaussian elimination is much faster. Take for example a small 3 x 3 system. If you perform Jacobi iterations on it you will require about 9 operations per iteration and you may need to perform more than 100 iterations to obtain a very good estimate: a total of at least 900 operations. On the other hand Gaussian elimination only needs to perform about 3^3 = 27 operations to solve the whole system and produce the exact solution! In fact it can be shown that iteration is preferable to Gaussian elimination if

    ln(ε) / ln(ρ) < n/3.                                          (2.9)

Here n corresponds to the size of the matrix A, ρ refers to the spectral radius of the iterative scheme, ρ = ρ(G), and ε is the relative error tolerance we wish to obtain from the iteration. Let us look at a simple example of this result.

Example: As usual we wish to solve the matrix system Ax = b. Suppose that A is a 30 x 30 matrix and that the spectral radius for the Gauss-Seidel iterative scheme is found to be ρ(G) = 0.4. Suppose also that we wish to find the solution accurate to within ε = 10^(-5). Is it best to perform the Gauss-Seidel iteration or just simple Gaussian elimination?

Solution: Note that for this example

    ln(ε) / ln(ρ(G)) = (-11.51)/(-0.916) ≈ 12.56.

Therefore inequality (2.9) would require 12.56 < 30/3 = 10, which is not true! Thus in this case Gaussian elimination is actually going to be faster.

One thing to keep in mind is that there are in fact matrix systems for which one method might converge while the other might not (the reason being that the spectral radius of the corresponding iteration matrix is not less than 1). Let us outline some important points about these methods and compare them with other techniques:

    Gauss-Seidel is typically faster than Jacobi.
    Gauss-Seidel and Jacobi have a cost of about n^2 operations per iteration.
    One iterative scheme may converge to the solution while another may not. This may depend on the choice of initial guess x^0 but, more importantly, on the spectral radius of the iterative scheme (we need ρ(G) < 1).
    Gaussian elimination, although it costs about n^3 operations, may be faster when it comes to moderate size systems.

Let us see the pseudo-code for some of these methods.

Jacobi: Suppose that we are provided with a matrix A, a vector B and a starting guess vector x^0.

1. For i = 1 to n do Y(i) = x^0(i).
2. Repeat the following until the given tolerance ε is satisfied:
       For i = 1 to n do
           Z(i) = ( B(i) - Σ_{j=1}^{i-1} A(i,j) Y(j) - Σ_{j=i+1}^{n} A(i,j) Y(j) ) / A(i,i)
       For i = 1 to n do Y(i) = Z(i).
3. Print out the vector Z.

Gauss-Seidel: Suppose that we are provided with a matrix A, a vector B and a starting guess vector x^0.

1. For i = 1 to n do Y(i) = x^0(i).
2. Repeat the following until the given tolerance ε is satisfied:
       For i = 1 to n do the following two steps:
           Z(i) = ( B(i) - Σ_{j=1}^{i-1} A(i,j) Y(j) - Σ_{j=i+1}^{n} A(i,j) Y(j) ) / A(i,i)
           Y(i) = Z(i)
3. Print out the vector Z.

Note that because Y(i) is overwritten with Z(i) inside the sweep, the first sum automatically uses the newest available values, which is exactly the Gauss-Seidel improvement over Jacobi.

However, we can in fact come up with methods which converge to the solution even faster under appropriate conditions (see for instance the SOR and SSOR methods).
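For completeness, here is a minimal sketch of both pseudo-codes in Python/NumPy (the function names, the tolerance handling and the iteration cap are my own choices, not part of the notes):

    import numpy as np

    def jacobi(A, b, x0, tol=1e-8, max_iters=1000):
        # Jacobi sweep as in the pseudo-code: every row uses only the previous iterate Y.
        A, b = np.asarray(A, float), np.asarray(b, float)
        n = len(b)
        y = np.array(x0, dtype=float)
        for _ in range(max_iters):
            z = np.empty(n)
            for i in range(n):
                s = A[i, :i] @ y[:i] + A[i, i + 1:] @ y[i + 1:]
                z[i] = (b[i] - s) / A[i, i]
            if np.max(np.abs(z - y)) < tol:    # stop once successive iterates agree
                return z
            y = z
        return y

    def gauss_seidel(A, b, x0, tol=1e-8, max_iters=1000):
        # Same sweep, but Y(i) is overwritten immediately, so later rows use new values.
        A, b = np.asarray(A, float), np.asarray(b, float)
        n = len(b)
        y = np.array(x0, dtype=float)
        for _ in range(max_iters):
            y_old = y.copy()
            for i in range(n):
                s = A[i, :i] @ y[:i] + A[i, i + 1:] @ y[i + 1:]
                y[i] = (b[i] - s) / A[i, i]
            if np.max(np.abs(y - y_old)) < tol:
                break
        return y

    A = np.array([[4.0, 3.0], [2.0, 5.0]])
    b = np.array([-5.0, 6.0])
    print(jacobi(A, b, [0.0, 0.0]))        # both approach (-3.0714, 2.4286)
    print(gauss_seidel(A, b, [0.0, 0.0]))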