Math 471 (Numerical methods) Chapter 3 (second half). System of equations

Math 471 (Numerical methods) Chapter 3 (second half). Systems of equations. Overlap: 3.5-3.8 of Bradie.

3.5 LU factorization w/o pivoting.

Motivation:
\[ (A \mid I) \xrightarrow{\text{Gaussian Elimination}} (U \mid L_1), \]
where $U$ is upper triangular and $L_1$ is lower triangular. Then, the entire Gaussian elimination process amounts to multiplying the augmented matrix from the left with $L_1$. Thus
\[ L_1 A = U \;\Longrightarrow\; A = L_1^{-1} U =: LU. \]

Why LU? (for a lower operation count, not for stability) When used for solving the linear system $A\vec{x} = \vec{b}$, i.e.
\[ LU\vec{x} = \vec{b} \quad\Longleftrightarrow\quad \begin{cases} L\vec{y} = \vec{b} & \text{solved with forward substitution,} \\ U\vec{x} = \vec{y} & \text{solved with backward substitution,} \end{cases} \]
it helps reduce the operation count from $O(n^3)$ in the original Gaussian Elimination to $O(n^2)$ in the forward/backward substitution for solving lower/upper triangular systems. Also, once the LU factorization is done, it can be reused for solving multiple linear systems with the same coefficient matrix but different right-hand-side vectors.

Note that the LU factorization itself, in the most general case, requires $O(n^3)$ operations. The reason is exactly the same as for the operation count of Gaussian Elimination. In detail, each row operation performed on the coefficient matrix amounts to a left-multiplication by an elementary matrix. Moreover, each of these matrices (except those representing row swapping) is lower triangular, and therefore their product is also lower triangular (we showed in class that the product of lower triangular matrices is still lower triangular).
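To make the operation counts above concrete, here is a minimal Matlab sketch of the two triangular solves used once $L$ and $U$ are known (illustrative only; the function and variable names are my own, and no error checking is done):

function x = lu_solve(L, U, b)
    % Solve L*U*x = b: first L*y = b (forward substitution),
    % then U*x = y (backward substitution). Each solve is O(n^2).
    n = length(b);
    y = zeros(n,1);
    for i = 1:n
        y(i) = (b(i) - L(i,1:i-1)*y(1:i-1)) / L(i,i);
    end
    x = zeros(n,1);
    for i = n:-1:1
        x(i) = (y(i) - U(i,i+1:n)*x(i+1:n)) / U(i,i);
    end
end

With the unit-diagonal convention for $L$ obtained from Gaussian elimination, the division by L(i,i) is by 1 and can be dropped.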

Theorem 1. If we perform a row operation "add $m\cdot[\text{row } i]$ to $[\text{row } j]$" on $A$ and arrive at $A_1$, then $MA = A_1$, where the elementary matrix $M$ is the identity matrix superposed with an entry $m$ at the $j$-th row and $i$-th column.

Similarly,

Theorem 2. If we perform a column operation "add $m\cdot[\text{column } i]$ to $[\text{column } j]$" on $A$ and arrive at $A_2$, then $AM = A_2$, where the elementary matrix $M$ is the identity matrix superposed with an entry $m$ at the $i$-th row and $j$-th column.

In principle: if a matrix $M$ represents a row/column operation, it is obtained by performing the same operation on the identity matrix. One should left-multiply the target matrix by $M$ if $M$ stands for a row operation, and right-multiply by $M$ if $M$ stands for a column operation.

************ The part below is optional but can be helpful *************

Now, we make an observation. On the $j$-th step of Gaussian elimination, zeroing out the lower part of the $j$-th column, multiple row operations are performed, which can be represented by a product $N_m \cdots N_2 N_1$. If we perform this same sequence of row operations on $I$, each $N_k$ corresponds to adding a rescaled $[\text{row } j]$ to some row below it. Therefore,

the 1's on the diagonal will be kept, and each entry in the lower part of the $j$-th column will record the factor used in each rescaling of $[\text{row } j]$:
\[
N_m \cdots N_2 N_1 =
\begin{pmatrix}
1 & & & & \\
 & \ddots & & & \\
 & & 1 & & \\
 & & m_{j+1,j} & \ddots & \\
 & & \vdots & & \\
 & & m_{n,j} & & 1
\end{pmatrix}.
\]
So, we use exactly $n-1$ lower triangular matrices $M_1, \ldots, M_{n-1}$. Each $M_j$ represents the operations collectively performed for zeroing out the $j$-th column of $A$, which suggests an algorithm to find $L$:
\[ M_{n-1} \cdots M_2 M_1 A = U. \tag{1} \]
In practice, one does bookkeeping of the Gaussian elimination (a sequence of row operations) and stores the information in the $M$'s. The final product $L = M_1^{-1} M_2^{-1} \cdots M_{n-1}^{-1}$ in (1) can be easily computed thanks to the following facts: the inverse of each $M_j$ is obtained by simply flipping the signs of its below-diagonal entries, and the product $M_1^{-1} M_2^{-1} \cdots M_{n-1}^{-1}$ is obtained by superposing the below-diagonal entries of the factors into a single unit lower triangular matrix. The proofs of these facts are skipped. But note that the last fact holds for $M_i^{-1} M_j^{-1} M_k^{-1}$ only when $i < j < k$.

*******************end of optional part*************************
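As a small illustration of this bookkeeping (my own sketch, assuming no pivoting is needed for the chosen matrix), one can store each multiplier directly into $L$ during the elimination and then verify $A = LU$:

% Gaussian elimination without pivoting, recording the multipliers in L.
A = [2 1 1; 4 -6 0; -2 7 2];          % a test matrix with nonzero pivots (my choice)
n = size(A,1);  U = A;  L = eye(n);
for j = 1:n-1
    for i = j+1:n
        m = U(i,j) / U(j,j);          % multiplier used to zero out U(i,j)
        U(i,:) = U(i,:) - m*U(j,:);   % row operation: add (-m)*[row j] to [row i]
        L(i,j) = m;                   % superpose the factor into column j of L
    end
end
disp(norm(A - L*U))                   % ~0 up to round-off: A = L*U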

A final remark. From the above discussion, we see that $L^{-1} = L_1$ is a by-product obtained as an intermediate result of Gaussian Elimination. Why not just use $L^{-1}$ directly? The answer is yes if both $L$ and $L^{-1}$ are dense matrices with $O(n^2)$ nonzero entries. In that case, solving

1. $L\vec{x} = \vec{b}$ with forward substitution, or
2. using $\vec{x} = L^{-1}\vec{b}$,

both cost $O(n^2)$ operations. In other words, there is no significant difference between using $L$ and $L^{-1}$ in terms of efficiency. However, when $A$ is a sparse matrix with $m$ nonzero entries ($m$ much less than $n^2$), its LU decomposition will very likely preserve the sparsity pattern, which brings the operation count for solving $L\vec{x} = \vec{b}$ down to $O(m)$. On the other hand, $L^{-1}$ may be a dense matrix with $O(n^2)$ nonzero entries, which means the calculation of $L^{-1}\vec{b}$ requires the same $O(n^2)$ operations, far more than $O(m)$. In Section 3.6 (Direct Factorization), we will see an alternative approach that obtains $L$, $U$ without using Gaussian Elimination. There, the case of sparse matrices will be discussed in detail.

3.5 (cont'd) LU factorization with pivoting.

LU decomposition, being equivalent to Gaussian elimination, has the same problem with zero (or very small) pivot entries. Pivoting is important in some cases, so row interchanges need to be added to the LU algorithm. Using the same matrix language, a row interchange is characterized by a permutation matrix:
\[ A \xrightarrow{\text{swap row } i \text{ and row } j} A_1 \quad\text{amounts to}\quad PA = A_1, \]
where the permutation matrix $P$ is the matrix resulting from swapping the $i$-th and $j$-th rows of the identity matrix. Similarly, with the same $P$ given above,
\[ A \xrightarrow{\text{swap column } i \text{ and column } j} A_2 \quad\text{amounts to}\quad AP = A_2. \]

Thus, Gaussian elimination with pivoting can be described with the addition of permutation matrices:
\[ A = (M_{n-1} P_{n-1} \cdots M_2 P_2 M_1 P_1)^{-1}\, U. \tag{2} \]
Problem! Row interchange destroys the lower triangular pattern of the $M$'s, so using (2) directly will not yield a lower triangular matrix $L$.

Remedy. The following properties of permutation matrices are useful (the proofs are discussed in class):
\[ P^2 = I, \qquad P_i M_j = \hat{M}_j P_i \ \text{ for } i > j, \]
where $\hat{M}_j$ is again a lower triangular elementary matrix. The key point here is to change the order of the $P$'s and $M$'s on the left-hand side of (2) so that it becomes
\[ A = \big[ (\hat{M}_{n-1} \cdots \hat{M}_2 \hat{M}_1)(P_{n-1} \cdots P_2 P_1) \big]^{-1} U, \]
and therefore $PA = LU$, with $P = P_{n-1}\cdots P_2 P_1$ and $L = (\hat{M}_{n-1}\cdots \hat{M}_1)^{-1}$. This version of the LU decomposition, as efficient as the original LU, is more stable because of pivoting.

Note. Now, solving $A\vec{x} = \vec{b}$ amounts to solving $PA\vec{x} = P\vec{b}$, i.e. $LU\vec{x} = P\vec{b}$. The additional calculation of $P\vec{b}$ is just a permutation of the entries of $\vec{b}$, which costs $O(n)$ operations.
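For a quick check in Matlab (not part of the notes), the built-in lu with three output arguments returns exactly this pivoted factorization, with a permutation matrix $P$ such that $PA = LU$:

A = [0 1 2; 1 2 3; 4 5 7];            % A(1,1) = 0 forces a row interchange
[L, U, P] = lu(A);                    % built-in LU with partial pivoting
disp(norm(P*A - L*U))                 % ~0: P*A = L*U
b = [1; 2; 3];
x = U \ (L \ (P*b));                  % solve A*x = b via L*U*x = P*b
disp(norm(A*x - b))                   % ~0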

3.6 Direct Factorization

There is a way to find the LU factorization if we completely forget about Gaussian Elimination. It comes from the most elementary idea: why not solve $A = LU$ by treating it as $n^2$ equations?
\[
\begin{pmatrix}
a_{1,1} & a_{1,2} & a_{1,3} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & a_{2,3} & \cdots & a_{2,n} \\
a_{3,1} & a_{3,2} & a_{3,3} & \cdots & a_{3,n} \\
\vdots & & & & \vdots \\
a_{n,1} & a_{n,2} & a_{n,3} & \cdots & a_{n,n}
\end{pmatrix}
=
\begin{pmatrix}
l_{1,1} & 0 & 0 & \cdots & 0 \\
l_{2,1} & l_{2,2} & 0 & \cdots & 0 \\
l_{3,1} & l_{3,2} & l_{3,3} & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
l_{n,1} & l_{n,2} & l_{n,3} & \cdots & l_{n,n}
\end{pmatrix}
\begin{pmatrix}
u_{1,1} & u_{1,2} & u_{1,3} & \cdots & u_{1,n} \\
0 & u_{2,2} & u_{2,3} & \cdots & u_{2,n} \\
0 & 0 & u_{3,3} & \cdots & u_{3,n} \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & u_{n,n}
\end{pmatrix}.
\]
Each entry $a_{i,j}$ gives an equation in terms of the unknown $l$'s and $u$'s,
\[ a_{i,j} = [\text{row } i \text{ of } L]\cdot[\text{column } j \text{ of } U], \]
that is,
\[ a_{i,j} = l_{i,1} u_{1,j} + l_{i,2} u_{2,j} + \cdots + l_{i,n} u_{n,j} = \sum_{k=1}^{n} l_{i,k} u_{k,j}. \tag{3} \]
Notice the running index $k$. However, solving these $n^2$ equations in the most straightforward way, i.e. with Gaussian Elimination, would cost $O\big((n^2)^3\big)$ operations, which is far too expensive. We have to take advantage of the lower and upper triangular structure of $L$ and $U$, and come up with a smarter algorithm.

First, let's fix the diagonal entries of $U$ to be 1. (One can also fix the diagonal entries of $L$ to be 1; the former is called the Crout method and the latter the Doolittle method.) Here, we adopt
\[ u_{i,i} = 1. \tag{4} \]
Then, we scan through $A$ row by row, writing down $a_{i,j}$ in terms of $[\text{row } i]$ of $L$ and $[\text{column } j]$ of $U$ while taking into account the zeros of $L$ and $U$. It turns out that scanning $[\text{row } i]$ of $A$ yields the values of the same row in $L$ and $U$.

This is obvious for $[\text{row } 1]$. The equation with $a_{1,1}$, namely $a_{1,1} = l_{1,1} u_{1,1}$, gives $l_{1,1}$ due to (4). The equation with $a_{1,j}$ for $j > 1$, namely $a_{1,j} = l_{1,1} u_{1,j}$, has only one term on the right-hand side due to the lower triangular structure of $L$. Since $l_{1,1}$ was solved above, one easily solves for $u_{1,j}$.

In a general step involving $[\text{row } i]$, we can still follow the above two-part procedure: the first part finds $l_{i,j}$ with $j \le i$ and the second part finds $u_{i,j}$ with $j > i$. This is best described in a for loop:

%Scanning of [row i]. Here, we should have already obtained the values of
%[row 1] ... [row i-1] in L and U
For j=1:n
    %Each iteration uses a_{i,j} to find either l_{i,j} or u_{i,j}.
    %Here, we should have already obtained the values of l_{i,1}, ..., l_{i,j-1}
    %and u_{i,1}, ..., u_{i,j-1}, some of which are simply zero.

    If $i \ge j$, the sum in (3) stops at $l_{i,j} u_{j,j}$:
    \[ a_{i,j} = l_{i,1} u_{1,j} + l_{i,2} u_{2,j} + \cdots + l_{i,j} u_{j,j}, \]
    but every term in this equation except $l_{i,j}$ has already been obtained (see the comments above about the availability of the $l$'s and $u$'s; a diagram in the original notes marks the available entries of $A$, $L$, $U$ in green and the entry being solved for in red). Therefore
    \[ l_{i,j} = \frac{1}{u_{j,j}} \Big[ a_{i,j} - \big(l_{i,1} u_{1,j} + \cdots + l_{i,j-1} u_{j-1,j}\big) \Big]. \tag{5} \]

    If $i < j$, the sum in (3) stops at $l_{i,i} u_{i,j}$:
    \[ a_{i,j} = l_{i,1} u_{1,j} + l_{i,2} u_{2,j} + \cdots + l_{i,i} u_{i,j}, \]
    but every term in this equation except $u_{i,j}$ has already been obtained (same diagram idea as above). Therefore
    \[ u_{i,j} = \frac{1}{l_{i,i}} \Big[ a_{i,j} - \big(l_{i,1} u_{1,j} + \cdots + l_{i,i-1} u_{i-1,j}\big) \Big]. \tag{6} \]
end of for j=1:n

So, the final algorithm for the Crout method is just to combine (5) and (6) into another loop:

For i=1:n
    insert (5) and (6) here
end of for i=1:n

What about the operation count of the above algorithm? Notice that we need to solve for $n^2$ entries $l_{i,j}$ or $u_{i,j}$, depending on $i \ge j$ or $i < j$. Each solve uses either (5) or (6), which obviously costs $O(n)$ operations at most. So, in total, the Crout method needs $O(n^2)\cdot O(n) = O(n^3)$ operations, the same as Gaussian Elimination!!

The main advantage of direct factorization, however, arises when $A$ is sparse, especially when the nonzero entries are close to the diagonal. It is very likely that the sparsity pattern of $A$ will be propagated into $L$ and $U$, so that many of the equations (5) and (6) are simply $0 = 0$, which requires no operation.
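A minimal Matlab sketch of the double loop just described (a direct translation of (5) and (6), with the Crout convention $u_{i,i}=1$, no pivoting, and assuming every divisor encountered is nonzero; in the sparse case one would additionally skip the trivial $0=0$ equations):

function [L, U] = crout_lu(A)
    % Direct factorization A = L*U with unit diagonal in U (Crout-type).
    n = size(A,1);
    L = zeros(n);  U = eye(n);
    for i = 1:n                        % scan [row i] of A
        for j = 1:n
            if j <= i                  % case (5): solve for l(i,j)
                L(i,j) = (A(i,j) - L(i,1:j-1)*U(1:j-1,j)) / U(j,j);
            else                       % case (6): solve for u(i,j)
                U(i,j) = (A(i,j) - L(i,1:i-1)*U(1:i-1,j)) / L(i,i);
            end
        end
    end
end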

We illustrate this phenomenon using a classical example.

Example. Tridiagonal matrix
\[
A = \begin{pmatrix}
a_1 & b_1 & & & \\
c_2 & a_2 & b_2 & & \\
 & c_3 & a_3 & b_3 & \\
 & & \ddots & \ddots & \ddots \\
 & & & c_n & a_n
\end{pmatrix}. \tag{7}
\]
It has an LU decomposition (a la the Doolittle method) with $L$, $U$ sharing the same sparsity pattern; indeed, both are bi-diagonal:
\[
L = \begin{pmatrix}
1 & & & \\
l_2 & 1 & & \\
 & \ddots & \ddots & \\
 & & l_n & 1
\end{pmatrix},
\qquad
U = \begin{pmatrix}
u_1 & b_1 & & \\
 & u_2 & b_2 & \\
 & & \ddots & \ddots \\
 & & & u_n
\end{pmatrix}. \tag{8}
\]
Note that the values $b_1, b_2, \ldots$ from $A$ are all preserved in $U$, which can be checked easily by expressing the $b_i$ in $A$ in terms of entries of $L$ and $U$. The design of an algorithm for this problem follows exactly the same procedure described above, but should skip all the trivial equations $0 = 0$. The reader is recommended to perform the algorithm on a 5-by-5 tridiagonal matrix by hand for a thorough understanding.

For i=1:n
    We scan [row i] of A. There are at most 3 nonzero entries. Use a typical row with 3 entries: $c_i$, $a_i$, $b_i$. Note that at this point the values of $l_1, \ldots, l_{i-1}$ and $u_1, \ldots, u_{i-1}$ should already be available.
    $c_i$ is at the $(i, i-1)$ position of $A$, so $c_i = l_i u_{i-1}$, which gives $l_i = c_i / u_{i-1}$.
    $a_i$ is at the $(i, i)$ position of $A$, so $a_i = l_i b_{i-1} + u_i$, which gives $u_i = a_i - l_i b_{i-1}$, where $l_i$ was solved above and $b_{i-1}$ appears at the $(i-1, i)$ position of $U$.
end of for i=1:n

Operation counts. Factorization: $O(n)$ (why??). Solving $A\vec{x} = \vec{b}$: $O(n)$ (why??). So much faster than $O(n^3)$.
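Before moving on, here is a minimal Matlab sketch of this $O(n)$ factorization and solve (Doolittle convention, unit diagonal in $L$; the test problem and the vector names are my own, and no zero divisors are assumed to occur):

% Tridiagonal LU and O(n) solve. a = diagonal of A, b = superdiagonal,
% c = subdiagonal (c(i-1) sits at position (i,i-1)), as in (7).
n = 6;
a = 2*ones(n,1);  b = -ones(n-1,1);  c = -ones(n-1,1);  rhs = ones(n,1);
u = zeros(n,1);  l = zeros(n,1);       % u = diagonal of U, l = subdiagonal of L
u(1) = a(1);
for i = 2:n
    l(i) = c(i-1) / u(i-1);            % from c_i = l_i * u_{i-1}
    u(i) = a(i) - l(i)*b(i-1);         % from a_i = l_i * b_{i-1} + u_i
end
y = zeros(n,1);  y(1) = rhs(1);        % forward solve L*y = rhs
for i = 2:n
    y(i) = rhs(i) - l(i)*y(i-1);
end
x = zeros(n,1);  x(n) = y(n) / u(n);   % backward solve U*x = y
for i = n-1:-1:1
    x(i) = (y(i) - b(i)*x(i+1)) / u(i);
end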

3.8 Iterative methods for solving linear systems.

First, a review of vector and matrix norms.

Vector norms, e.g.
\[ \|\vec{x}\|_\infty = \max\{|x_1|, \ldots, |x_n|\}, \qquad \|\vec{x}\|_2 = \big(|x_1|^2 + |x_2|^2 + \cdots + |x_n|^2\big)^{1/2}, \qquad \|\vec{x}\|_1 = |x_1| + |x_2| + \cdots + |x_n|. \]
Matrix norms deduced from vector norms:
\[ \|A\| = \max_{\vec{x} \ne 0} \frac{\|A\vec{x}\|}{\|\vec{x}\|}. \]
Note each specific vector norm is associated with a matrix norm, e.g.
\[ \|A\|_\infty = \max_{\vec{x} \ne 0} \frac{\|A\vec{x}\|_\infty}{\|\vec{x}\|_\infty}, \]
and we have learned that
\[ \|A\|_\infty = \max_i \sum_j |a_{ij}| \qquad \text{(maximum row sum)}. \]
We have proved that $\|A\vec{x}\| \le \|A\|\,\|\vec{x}\|$ and $\|AB\| \le \|A\|\,\|B\|$.

Now, iterative methods for solving $A\vec{x} = \vec{b}$:
\[ \vec{x}_{k+1} = B\vec{x}_k + \vec{c}, \tag{9} \]
where $B$ is the iteration matrix. ASSUME the above iteration converges: $\vec{x}_* = \lim_{k\to\infty} \vec{x}_k$. Taking the limit in (9),
\[ \vec{x}_* = B\vec{x}_* + \vec{c} \quad\Longleftrightarrow\quad (I - B)\vec{x}_* = \vec{c}. \]
Thus, we require the above equation to be equivalent to $A\vec{x} = \vec{b}$. That is, theoretically, $(I - B)^{-1}\vec{c} = A^{-1}\vec{b}$.

Example. Jacobi method. To solve $A\vec{x} = \vec{b}$:

Splitting. $A = D + L + U$, where $D$ contains only the diagonal entries of $A$, $L$ the strictly lower triangular part, and $U$ the strictly upper triangular part. The other entries of $D$, $L$, $U$ are filled with zeros. (Note: the matrices $L$, $U$ here are completely irrelevant to the LU decomposition.) Then the iteration scheme is derived as
\[ A\vec{x} = \vec{b} \;\Longleftrightarrow\; (D + L + U)\vec{x} = \vec{b} \;\Longleftrightarrow\; D\vec{x} = -(L + U)\vec{x} + \vec{b}, \]
thus
\[ \vec{x}_{k+1} = -D^{-1}(L + U)\vec{x}_k + D^{-1}\vec{b}. \]
In practice, the scheme is usually written as
\[ D\vec{x}_{k+1} = -(L + U)\vec{x}_k + \vec{b}, \tag{10} \]
since we only need to solve for $\vec{x}_{k+1}$ and don't have to find $D^{-1}$.

Example. Apply the Jacobi method to the following system:
\[ A\vec{x} = \vec{b} \quad\text{where}\quad A = \begin{pmatrix} 3 & -1 & 0 \\ -1 & 3 & -1 \\ 3 & -2 & 6 \end{pmatrix} \quad\text{and}\quad \vec{b} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}. \]
Splitting:
\[ A = D + L + U = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ -1 & 0 & 0 \\ 3 & -2 & 0 \end{pmatrix} + \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix}. \]

Using (10), we have
\[ D\vec{x}_{k+1} = -(L + U)\vec{x}_k + \vec{b}, \qquad \vec{x}_k = \begin{pmatrix} x_1^{(k)} \\ x_2^{(k)} \\ x_3^{(k)} \end{pmatrix}, \]
and in terms of components
\[
\begin{aligned}
3x_1^{(k+1)} &= x_2^{(k)} + 1, \\
3x_2^{(k+1)} &= x_1^{(k)} + x_3^{(k)} + 2, \\
6x_3^{(k+1)} &= -3x_1^{(k)} + 2x_2^{(k)} + 3.
\end{aligned}
\]

Operation count. In each step of the Jacobi iteration, the operation count is $O(n^2)$. Why? The multiplication $(L + U)\vec{x}_k$ costs $O(n^2)$ operations, or more precisely $O(\hat{n})$ with $\hat{n}$ being the number of nonzero entries of $A$. Then, solving for $\vec{x}_{k+1}$ in $D\vec{x}_{k+1} = \ldots$ only takes $O(n)$ operations because $D$ is a diagonal matrix! So the total operation count is $O(n^2 m)$, or $O(\hat{n} m)$, where $m$ is the number of iterations performed.

Comparison with Gaussian Elimination and LU factorization. These two both require $O(n^3)$ operations. So the Jacobi method can save computation time if $m \ll n$. The trade-off, however, is a loss of accuracy, which improves as $m$ gets larger. (Think about an iterative method learned before: Newton's method for solving $f(x) = 0$.) Note. In some situations, speedy algorithms are more crucial than perfect accuracy!

Error analysis for the Jacobi method. Let's mimic the technique used for Newton's method, that is, subtract the iterative equation $\vec{x}_{k+1} = B\vec{x}_k + \vec{c}$ from the exact one $\vec{x}_* = B\vec{x}_* + \vec{c}$ to get
\[
\begin{aligned}
\vec{x}_* - \vec{x}_{k+1} &= B(\vec{x}_* - \vec{x}_k) \\
\text{in terms of the error} \;\Longrightarrow\; \vec{e}_{k+1} &= B\vec{e}_k, \\
\text{after } k \text{ iterations} \;\Longrightarrow\; \vec{e}_k &= B^k \vec{e}_0, \\
\text{take vector norms} \;\Longrightarrow\; \|\vec{e}_k\| &= \|B^k \vec{e}_0\|, \\
\text{recall the properties of norms} \;\Longrightarrow\; \|\vec{e}_k\| &\le \|B\|^k \|\vec{e}_0\|.
\end{aligned}
\]
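A minimal Matlab sketch of this Jacobi iteration for the 3-by-3 example, checking the error bound numerically (the reference solution, the number of sweeps, and the variable names are my own choices):

A = [3 -1 0; -1 3 -1; 3 -2 6];  b = [1; 2; 3];
D = diag(diag(A));  LplusU = A - D;
xstar = A \ b;                         % reference solution, used to measure the error
x = zeros(3,1);  err = zeros(20,1);
for k = 1:20
    x = D \ (b - LplusU*x);            % D*x_{k+1} = -(L+U)*x_k + b
    err(k) = norm(xstar - x, inf);     % ||e_k||_inf
end
err(2:end) ./ err(1:end-1)             % each ratio is at most ||B||_inf = 5/6 (see below)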

Conclusion. The scheme converges if $\|B\| < 1$ in some norm! And the rate of convergence is $O(\|B\|^k)$.

Example. Use the previous example. The scheme is $D\vec{x}_{k+1} = -(L + U)\vec{x}_k + \vec{b}$, i.e. $\vec{x}_{k+1} = -D^{-1}(L + U)\vec{x}_k + D^{-1}\vec{b}$, so
\[ B = -D^{-1}(L + U) = \begin{pmatrix} 0 & 1/3 & 0 \\ 1/3 & 0 & 1/3 \\ -3/6 & 2/6 & 0 \end{pmatrix}. \]
Now, pick a suitable norm and simply compute (see the review above):
\[ \|B\|_\infty = \frac{5}{6} < 1 \;\Longrightarrow\; \text{the scheme used in the example is guaranteed to converge, and the convergence rate is } \|\vec{e}_k\|_\infty \le \Big(\frac{5}{6}\Big)^k \|\vec{e}_0\|_\infty. \]

Example. Show that the Jacobi method must converge if $A$ is strictly diagonally dominant, that is,
\[ |a_{ii}| > \sum_{\substack{j=1 \\ j \ne i}}^{n} |a_{ij}| \quad\text{for all } i = 1, 2, 3, \ldots, n. \]
Hint: consider the infinity norm.

3.8 (cont'd) Gauss-Seidel Method

We have learned the Jacobi method
\[ \vec{x}_{k+1} = -D^{-1}(L + U)\vec{x}_k + D^{-1}\vec{b}. \]
Here $L$, $D$, $U$ come from the splitting $A = L + D + U$. We define the Jacobi iteration matrix
\[ B_{Jac} \stackrel{\text{def}}{=} -D^{-1}(L + U), \]
and we know that $\|B_{Jac}\|_\infty < 1$ for a strictly diagonally dominant matrix $A$.

Another iterative method for solving $A\vec{x} = \vec{b}$: the Gauss-Seidel method. First, split $A = L + D + U$ as in the Jacobi method. Then, rewrite $A\vec{x} = \vec{b}$ in equivalent forms:
\[ A\vec{x} = \vec{b} \;\Longleftrightarrow\; (L + D + U)\vec{x} = \vec{b} \;\Longleftrightarrow\; (L + D)\vec{x} = -U\vec{x} + \vec{b} \;\Longleftrightarrow\; \vec{x} = -(L + D)^{-1}U\vec{x} + (L + D)^{-1}\vec{b}. \tag{11} \]

By the last equation of (11), the iteration scheme has the same form as before but with a different iteration matrix:
\[ \vec{x}_{k+1} = B_{GS}\vec{x}_k + \vec{c} = -(L + D)^{-1}U\vec{x}_k + (L + D)^{-1}\vec{b}, \qquad B_{GS} \stackrel{\text{def}}{=} -(L + D)^{-1}U. \]
In the case of $A$ being sparse, however, $(L + D)^{-1}$ is most likely dense with $O(n^2)$ nonzero entries, increasing the operation count. Instead, we use the form $(L + D)\vec{x} = -U\vec{x} + \vec{b}$ in (11) and solve for $\vec{x}_{k+1}$ in
\[ (L + D)\vec{x}_{k+1} = -U\vec{x}_k + \vec{b}, \]
which requires $O(\hat{n})$ operations, with $\hat{n}$ the number of nonzero entries in $L + D$.

Example. Take the same system as before:
\[ A\vec{x} = \vec{b} \quad\text{where}\quad A = \begin{pmatrix} 3 & -1 & 0 \\ -1 & 3 & -1 \\ 3 & -2 & 6 \end{pmatrix}, \qquad \vec{b} = \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix}. \]
Splitting:
\[ A = D + L + U = \begin{pmatrix} 3 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 6 \end{pmatrix} + \begin{pmatrix} 0 & 0 & 0 \\ -1 & 0 & 0 \\ 3 & -2 & 0 \end{pmatrix} + \begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix}. \]
So the Gauss-Seidel iteration scheme is
\[ \begin{pmatrix} 3 & 0 & 0 \\ -1 & 3 & 0 \\ 3 & -2 & 6 \end{pmatrix} \vec{x}_{k+1} = -\begin{pmatrix} 0 & -1 & 0 \\ 0 & 0 & -1 \\ 0 & 0 & 0 \end{pmatrix} \vec{x}_k + \vec{b}, \]
and in each step $\vec{x}_{k+1}$ is solved using forward substitution. The operation count per step is $O(n^2)$.
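A matching Matlab sketch of the Gauss-Seidel iteration for the same system, in the triangular-solve form $(L+D)\vec{x}_{k+1} = -U\vec{x}_k + \vec{b}$ (again just an illustration, with my own choice of iteration count):

A = [3 -1 0; -1 3 -1; 3 -2 6];  b = [1; 2; 3];
D = diag(diag(A));  L = tril(A,-1);  U = triu(A,1);
x = zeros(3,1);
for k = 1:20
    x = (L + D) \ (b - U*x);           % lower triangular solve = forward substitution
end
disp(norm(A*x - b, inf))               % residual after 20 sweeps; should be tiny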

Example. Find an iteration scheme to solve the following n-by-n system (which is NOT tridiagonal anymore) with the Gauss-Seidel iteration, and compare the operation counts of the two theoretically equivalent approaches
\[ \vec{x}_{k+1} = -(L + D)^{-1}U\vec{x}_k + (L + D)^{-1}\vec{b} \]
and
\[ (L + D)\vec{x}_{k+1} = -U\vec{x}_k + \vec{b}. \]
The system is $A\vec{x} = \vec{b}$ with
\[
A = \begin{pmatrix}
-2 & 1 & & & 0.3 \\
1 & -2 & 1 & & \\
 & \ddots & \ddots & \ddots & \\
 & & 1 & -2 & 1 \\
0.5 & & & 1 & -2
\end{pmatrix}.
\]
The scheme is easy to write. What about the operation counts? Using $\vec{x}_{k+1} = -(L + D)^{-1}U\vec{x}_k + (L + D)^{-1}\vec{b}$, each iteration step costs $O(n^2)$ operations because $(L + D)^{-1}$ is a dense matrix with $O(n^2)$ nonzero entries (try this with Matlab). On the other hand, using $(L + D)\vec{x}_{k+1} = -U\vec{x}_k + \vec{b}$ only requires $O(n)$ operations in each iteration step, thanks to the sparse structure of
\[
L + D = \begin{pmatrix}
-2 & & & \\
1 & -2 & & \\
 & \ddots & \ddots & \\
0.5 & & 1 & -2
\end{pmatrix}
\qquad\text{and}\qquad
U = \begin{pmatrix}
0 & 1 & & 0.3 \\
 & 0 & 1 & \\
 & & \ddots & \ddots \\
 & & & 0
\end{pmatrix}.
\]

Example. Do an error analysis on the previous example for n = 10 in terms of the infinity norm $\|B_{GS}\|_\infty$. Compare this infinity norm with $\|B_{Jac}\|_\infty$. With the help of Matlab, we can compute

>> n=10; D=eye(n)*(-2); L=D*0; U=D*0;
>> for i=1:n-1, L(i+1,i)=1; U(i,i+1)=1; end;
>> L(n,1)=0.5; U(1,n)=0.3;
>> B=-inv(L+D)*U;
>> rs=sum(abs(B),2); max(rs)

Here, in the last line, rs=sum(abs(B),2) computes the row sums of B in absolute value. The second argument of the function sum selects column sums (1) or row sums (2).

The answer is $\|B_{GS}\|_\infty = 0.9975 \;\Longrightarrow\; \|\vec{e}_{k+1}\|_\infty \le 0.9975\,\|\vec{e}_k\|_\infty \;\Longrightarrow\;$ convergent!

The infinity norm of $B_{Jac} = -D^{-1}(L + U)$ can be easily computed by hand:
\[ \|B_{Jac}\|_\infty = 1 \;\Longrightarrow\; \|\vec{e}_{k+1}\|_\infty \le \|\vec{e}_k\|_\infty \;\Longrightarrow\; \text{no conclusion on convergence.} \]
However, numerical experiment shows that the Jacobi method still converges in this case, even though $\|B_{Jac}\|_\infty = 1$. To this end, we introduce a fundamental quantity that determines the convergence rate of iterative schemes for linear systems...

3.8 (cont'd) Spectral radius

The spectral radius of a matrix, defined as
\[ \rho(B) = \max\{|\lambda| : \lambda \text{ an eigenvalue of } B\}, \]
is the most fundamental quantity in deciding the convergence of the iterative method
\[ \vec{x}_{k+1} = B\vec{x}_k + \vec{c}. \tag{12} \]
In fact, we have the following theorem.

Theorem 3. Let $\vec{x}_*$ be the exact solution and $\vec{e}_k = \vec{x}_* - \vec{x}_k$ be the error. Then,
\[ \rho(B) = \lim_{k\to\infty} \frac{\|\vec{e}_{k+1}\|}{\|\vec{e}_k\|}. \]
Proof. (Math 571.)

This theorem implies that, if $\rho(B) < 1$, then $\|\vec{e}_k\|$ decays almost like $(\rho(B))^k$ for large $k$'s, and therefore the method (12) converges. With different wording, we can say the iteration converges at order 1 with asymptotic constant $\lambda = \rho(B)$. (Now it seems reasonable to use $\lambda$ here, since it is also a common symbol for eigenvalues.)
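Now that the spectral radius is available, the "numerical experiment" mentioned above can be made precise by continuing the earlier Matlab session for the n = 10 example (a quick check of mine, using eig):

>> BJ = -inv(D)*(L+U);  BGS = -inv(L+D)*U;
>> [norm(BJ,inf), norm(BGS,inf)]              % 1 and 0.9975, as computed above
>> [max(abs(eig(BJ))), max(abs(eig(BGS)))]    % both spectral radii are < 1, so both methods converge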

Theorem 4. For any matrix norm induced by a vector norm, it is always true that $\rho(B) \le \|B\|$.

Proof. We let $\lambda$ be the eigenvalue with the largest absolute value, so that $\rho(B) = |\lambda|$. Let $\vec{u}$ be an associated eigenvector, so that $B\vec{u} = \lambda\vec{u}$. Then, by the definition of the matrix norm,
\[ \|B\| = \max_{\vec{v} \ne 0} \frac{\|B\vec{v}\|}{\|\vec{v}\|} \ge \frac{\|B\vec{u}\|}{\|\vec{u}\|} = \frac{\|\lambda\vec{u}\|}{\|\vec{u}\|} = |\lambda| = \rho(B). \]
DONE.

This theorem asserts that the spectral radius serves as a lower bound for all matrix norms induced from vector norms. So if one can use $\|B\| < 1$ to show convergence, then $\rho(B) < 1$ will also show convergence. On the other hand, $\|B\| > 1$ does not necessarily imply divergence, whereas $\rho(B) > 1$ does guarantee divergence.

Example. Given the matrix $A = \begin{pmatrix} 2 & 2 \\ -2 & 3 \end{pmatrix}$.

i) Compute $\rho(B_{Jac})$, $\|B_{Jac}\|_\infty$, $\|B_{Jac}\|_2$. What convergence rate does each one of them tell us?

ii) Compute $\rho(B_{GS})$, $\|B_{GS}\|_\infty$, $\|B_{GS}\|_2$. What convergence rate does each one of them tell us?

Solution. Split
\[ A = L + D + U = \begin{pmatrix} 0 & 0 \\ -2 & 0 \end{pmatrix} + \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix} + \begin{pmatrix} 0 & 2 \\ 0 & 0 \end{pmatrix}. \]

i) $B_{Jac} = -D^{-1}(L + U) = \begin{pmatrix} 0 & -1 \\ 2/3 & 0 \end{pmatrix}$, so
\[ \det(B_{Jac} - \lambda I) = \det\begin{pmatrix} -\lambda & -1 \\ 2/3 & -\lambda \end{pmatrix} = \lambda^2 + \frac{2}{3}. \]
Set the above expression equal to 0 and solve for $\lambda$:
\[ \lambda_{1,2} = \pm\sqrt{\tfrac{2}{3}}\, i. \]
Note. Imaginary numbers appear in the analysis of a real-number problem.

2 So ρ(b Jac ) = i 3 = 2 <. Thus, the Jacobi method converges at a rate e 3 k ( k 2 3) as k. Now, easy to compute that B Jac =. This condition alone does not imply convergence or divergence (inconclusive). ( ) 4/9 0 Also, compute (B Jac ) T B Jac = and find its eigenvalues λ,2 = ± 2 3 0 so that B Jac 2 = max λ i ((B Jac ) T 2 B Jac ) =. So the 2-norm of B 3 Jac implies the same convergence rate as ρ(b Jac ) does. ( ) 0 ii) B GS = (L + D) U =. 0 2/3 det(b GS λi) = det Set the above equation = 0 and solve for λ ( λ 0 2 3 λ λ = 2 3, λ 2 = 0 ) = λ 2 + 2 3 λ So ρ(b GS ) = 2 3 <. Thus, the Jacobi method converges at a rate e k ( 2 3) k as k. The G-S method converges faster than the Jacobi method in this example. Now, easy to compute that B GS =. Again, this condition alone does not imply convergence or divergence (inconclusive). ( ) 0 0 Also, compute (B GS ) T B GS = and find its eigenvalues λ = 0, λ 2 = 3 9 0 3/9 so that B GS 2 = max λ i ((B GS ) T 3 B GS ) = >. So the 2-norm of B 9 GS gives no information on the convergence of the G-S method. 3.8 (cont d) The SOR (successive over-relaxation) method Before we go to the SOR method, let s discuss more about spectral radius. As a theoretical tool, spectral radius is indeed not easy to find in practice. For a handful of special matrices, though, we do know something about the convergence of Jacobi and Gauss-Seidel methods. We state without proving the following properties.. The Jacobi method converges for strictly diagonally dominant matrices A (see HW 5). That is, ρ(b Jac ) <. The Gauss-Seidel method also converges in this case. 7

3.8 (cont'd) The SOR (successive over-relaxation) method

Before we go to the SOR method, let's discuss the spectral radius a bit more. As a theoretical tool, the spectral radius is indeed not easy to find in practice. For a handful of special matrices, though, we do know something about the convergence of the Jacobi and Gauss-Seidel methods. We state without proof the following properties.

1. The Jacobi method converges for strictly diagonally dominant matrices $A$ (see HW 5). That is, $\rho(B_{Jac}) < 1$. The Gauss-Seidel method also converges in this case.

2. The Gauss-Seidel method converges for symmetric positive definite (definition below) matrices $A$. That is, $\rho(B_{GS}) < 1$.

3. For any $A$ with $a_{ii} > 0$ and $a_{ij} \le 0$ ($i \ne j$), one of the following cases has to be true:
\[ 0 \le \rho(B_{GS}) \le \rho(B_{Jac}) < 1 \qquad\text{or}\qquad 1 \le \rho(B_{Jac}) \le \rho(B_{GS}). \]
In other words, the Gauss-Seidel method either converges faster than the Jacobi method or diverges faster than the Jacobi method. To actually show convergence, one needs a further argument.

Definition. A square matrix $A$ is symmetric if $A^T = A$. A symmetric matrix $A$ is positive definite if
\[ \vec{x}^T A \vec{x} > 0 \quad\text{for all nonzero } \vec{x}. \]
Equivalently, a symmetric $A$ is positive definite if all its eigenvalues are positive.

Example. Show that the Jacobi method converges for the difference matrix
\[ A\vec{x} = \vec{b}, \qquad A = \begin{pmatrix} 2 & -1 & & \\ -1 & 2 & -1 & \\ & \ddots & \ddots & \ddots \\ & & -1 & 2 \end{pmatrix}. \]
Proof. $A$ is diagonally dominant since $|a_{ii}| \ge \sum_{j \ne i} |a_{ij}|$ is true for all rows. But $A$ is not strictly diagonally dominant because $|a_{ii}| > \sum_{j \ne i} |a_{ij}|$ is true only for $i = 1, n$. So, we cannot directly use the above statements. Instead, let's prove $\rho(B_{Jac}) < 1$ by a more careful argument.

Proof by contradiction. Assume $\rho(B_{Jac}) \ge 1$. Let $\lambda$ be the leading eigenvalue, so that $|\lambda| = \rho(B_{Jac}) \ge 1$. Let the associated eigenvector be $\vec{v} = (v_1, v_2, \ldots, v_n)^T$, so that
\[ B_{Jac}\vec{v} = \lambda\vec{v}. \tag{13} \]

Now we know that, except for the first and last rows, the $i$-th row of $B_{Jac}$ is
\[ \Big(0, \ldots, \tfrac{1}{2}, 0, \tfrac{1}{2}, \ldots, 0\Big), \]
with $\tfrac{1}{2}$ in the $(i-1)$-th and $(i+1)$-th columns. So the $i$-th entry of $B_{Jac}\vec{v}$ is $\tfrac{1}{2}v_{i-1} + \tfrac{1}{2}v_{i+1}$. Plugging this into equation (13), we have
\[ \tfrac{1}{2}v_{i-1} + \tfrac{1}{2}v_{i+1} = \lambda v_i \quad\text{for } i = 2, 3, \ldots, n-1. \tag{14} \]
Let $v_k$ be the entry of $\vec{v}$ with the largest absolute value. By (14), we have
\[ |\lambda v_k| = \Big|\tfrac{1}{2}v_{k-1} + \tfrac{1}{2}v_{k+1}\Big| \le \tfrac{1}{2}\big(|v_{k-1}| + |v_{k+1}|\big), \]
where the inequality is due to the triangle inequality. Since, by assumption, $|\lambda| \ge 1$ and $|v_k| \ge |v_{k\pm 1}|$, the above inequality has to be an equality,
\[ |\lambda v_k| = \tfrac{1}{2}\big(|v_{k-1}| + |v_{k+1}|\big), \]
which leaves us with only one possibility:
\[ |v_{k-1}| = |v_k| = |v_{k+1}|. \]
So $v_{k-1}$ and $v_{k+1}$ have the same absolute value as $v_k$, and they all have the largest absolute value among $v_1, v_2, \ldots, v_n$. We can propagate the above argument to $v_{k-2}$ and $v_{k+2}$, then $v_{k-3}$ and $v_{k+3}$, ... Eventually, all the $v_i$'s share the same absolute value. In particular, $|v_1| = |v_2|$. But by inspecting the first row of equation (13), we have $\tfrac{1}{2}v_2 = \lambda v_1$, i.e. $v_2 = 2\lambda v_1$; together with $|v_1| = |v_2|$ and $|\lambda| \ge 1$, this yields $v_1 = 0$, and hence all entries of $\vec{v}$ are zero. Contradiction to the definition of eigenvectors! DONE.

Example. For the same matrix $A$ as above, show that the Gauss-Seidel method converges, and converges no slower than the Jacobi method.

Proof. Since the above example verifies $\rho(B_{Jac}) < 1$, we use property 3 above for this $A$ to conclude that $\rho(B_{GS}) \le \rho(B_{Jac}) < 1$. DONE.

The SOR method.

Motivation: can we improve the convergence rate, i.e. lower $\rho(B)$, while maintaining the same level of operation count?

Idea: the exact solution $\vec{x}_*$ satisfies (as in the derivation of the G-S method)
\[ \vec{x}_* = -D^{-1}\big((L + U)\vec{x}_* - \vec{b}\big). \]

Take a linear combination of the LHS and RHS with weights $1 - \omega$ and $\omega$. It should give us $\vec{x}_*$ again:
\[ \vec{x}_* = (1 - \omega)\vec{x}_* - \omega D^{-1}\big((L + U)\vec{x}_* - \vec{b}\big). \]
Eliminate the inverse term $D^{-1}$:
\[ D\vec{x}_* = (1 - \omega)D\vec{x}_* - \omega\big((L + U)\vec{x}_* - \vec{b}\big). \]
Assign index $k+1$ to the $D\vec{x}$ term on the LHS and the $L\vec{x}$ term on the RHS. Assign index $k$ to the rest of the terms. (This way, the coefficient matrix of $\vec{x}_{k+1}$ is lower triangular, for which we have a fast algorithm to solve.)
\[ D\vec{x}_{k+1} = (1 - \omega)D\vec{x}_k - \omega\big(L\vec{x}_{k+1} + U\vec{x}_k - \vec{b}\big). \]
The above form is the SOR method used in practice. Theoretically, the iteration can be written as
\[ \vec{x}_{k+1} = B^{\omega}_{SOR}\vec{x}_k + \vec{c}, \qquad\text{where}\qquad B^{\omega}_{SOR} = (D + \omega L)^{-1}\big((1 - \omega)D - \omega U\big). \]
Note. The Gauss-Seidel method amounts to $\omega = 1$. It is easy to see that the SOR method has the same operation count as the G-S method, thanks to their lower triangular structure.

Now, since $\omega$ is a free parameter, can we find some optimal value of $\omega$ such that $\rho(B^{\omega}_{SOR})$ is small, especially smaller than $\rho(B_{GS})$? The answer is yes if $A$ has special structure.

Theorem 5. If $A$ is symmetric positive definite and (block) tri-diagonal, then $\rho(B_{GS}) = \rho^2(B_{Jac}) < 1$, and the optimal value of $\omega$ for the SOR method is
\[ \omega = \frac{2}{1 + \sqrt{1 - \rho(B_{GS})}}, \]
in which case the spectral radius $\rho(B^{\omega}_{SOR}) = \omega - 1$.

Note. By simple manipulation, it is not difficult to show that $\rho(B^{\omega}_{SOR}) < \rho(B_{GS}) < \rho(B_{Jac})$. Thus, in this special case (positive definite and tri-diagonal), the SOR method converges faster than G-S, which converges faster than Jacobi.
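As a closing illustration (my own sketch), the three spectral radii can be compared numerically for the difference matrix from the earlier example, which is symmetric positive definite and tridiagonal, so Theorem 5 applies:

n = 20;
A = 2*eye(n) - diag(ones(n-1,1),1) - diag(ones(n-1,1),-1);   % the difference matrix
D = diag(diag(A));  L = tril(A,-1);  U = triu(A,1);
rhoJ   = max(abs(eig(-(D \ (L + U)))));
rhoGS  = max(abs(eig(-((L + D) \ U))));
w      = 2 / (1 + sqrt(1 - rhoGS));                  % optimal omega from Theorem 5
rhoSOR = max(abs(eig((D + w*L) \ ((1 - w)*D - w*U))));
[rhoJ^2, rhoGS, rhoSOR, w - 1]                       % expect rhoGS = rhoJ^2 and rhoSOR = w - 1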