Chapter IV Solving Systems of Linear Equations
Goal: to construct some general-purpose algorithms for solving systems of linear equations.
4.6 Solution of Equations by Iterative Methods
The Gaussian algorithm and its variants are called direct methods for solving the problem Ax = b. They proceed through a finite number of steps and produce a solution x that would be completely accurate were it not for roundoff errors. An indirect method, by contrast, produces a sequence of vectors that ideally converges to the solution. The computation is halted when an approximate solution of some specified accuracy is obtained, or after a certain number of iterations. Indirect methods are almost always iterative in nature: a simple process is applied repeatedly to generate such a sequence.
Why use iterative methods for linear systems? A direct method costs O(n^3) arithmetic operations and O(n^2) storage. Iterative methods require less storage and are simple to program, and they are especially suited to large sparse systems (coefficient matrices containing many zero entries): when n >> 1 and A is sparse, each iteration is cheap, so iterative methods are often preferred for such problems.
General iterative methods. The basic framework of an iterative method for solving an equation F(x) = 0:
1. Rewrite F(x) = 0 in the equivalent fixed-point form x = Φ(x).
2. Choose an initial value x^(0) and generate the iteration sequence x^(k+1) = Φ(x^(k)).
3. If the limit x = lim_{k→+∞} x^(k) exists, then x solves the equation. In actual computation, the iteration is stopped when ||x^(k+1) − x^(k)|| < ε or when a prescribed number of iterations is reached.
If lim_{k→+∞} x^(k) does not exist, the iteration fails; one must choose a new iteration formula or a new initial value x^(0).
Here "x^(k) converges to x" means lim_{k→∞} ||x^(k) − x|| = 0. The basic idea of iterative methods for linear systems is the same as for nonlinear equations.
Consider the equation Ax = b, which we wish to put into the fixed-point form x = Gx + g. A certain nonsingular matrix Q, called the splitting matrix, is prescribed; then the problem above is equivalent to
Qx = (Q − A)x + b.
This suggests an iterative process, defined by
Qx^(k) = (Q − A)x^(k−1) + b    (k ≥ 1)    (1)
or
x^(k) = (I − Q^{-1}A)x^(k−1) + Q^{-1}b    (k ≥ 1)    (2)
where x^(0) is an arbitrary initial vector. Note that equation (2) is used for theoretical analysis only; in practice there is no need to compute Q^{-1}.
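To make the scheme concrete, here is a minimal sketch in Python/NumPy (our code, not from the book; the function name splitting_iteration and the stopping rule are our choices). It implements (1) directly: each step solves a system with Q rather than forming Q^{-1}.

```python
import numpy as np

def splitting_iteration(A, b, Q, x0, tol=1e-10, max_iter=1000):
    """General splitting iteration: solve Q x_new = (Q - A) x_old + b."""
    x = x0.astype(float)
    for k in range(max_iter):
        # Solve with Q rather than forming Q^{-1}; equation (2) is for
        # theoretical analysis only.
        x_new = np.linalg.solve(Q, (Q - A) @ x + b)
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter
```

In practice Q is chosen so that this solve is cheap (diagonal or triangular), as the methods below illustrate.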
Now, our objective is to choose Q (with det Q ≠ 0) s.t.
1. the sequence {x^(k)} is easily computed;
2. the sequence {x^(k)} converges rapidly to the solution of Ax = b.
Obviously, the exact solution x satisfies
Qx = (Q − A)x + b, i.e., x = (I − Q^{-1}A)x + Q^{-1}b.
Clearly x is a fixed point of F(x) ≡ (I − Q^{-1}A)x + Q^{-1}b. Then
x^(k) − x = (I − Q^{-1}A)(x^(k−1) − x)    (k ≥ 1)
so
||x^(k) − x|| ≤ ||I − Q^{-1}A|| ||x^(k−1) − x||    (k ≥ 1)
and therefore
||x^(k) − x|| ≤ ||I − Q^{-1}A||^k ||x^(0) − x||    (k ≥ 1).
Thus, if ||I − Q^{-1}A|| < 1, one can conclude that
lim_{k→∞} ||x^(k) − x|| = 0.
Observe that ||I − Q^{-1}A|| < 1 implies the invertibility of Q^{-1}A, and hence of A. Hence, we have:

Theorem (on Iterative Method Convergence)
If ||I − Q^{-1}A|| < 1 for some subordinate matrix norm, then the sequence produced by (1) converges to the solution of Ax = b for any initial vector x^(0).
Remark
The matrix G ≡ I − Q^{-1}A is usually called the iteration matrix. If δ ≡ ||I − Q^{-1}A|| < 1, then we can use the following stopping condition for the iterative method:
||x^(k) − x|| ≤ (δ/(1 − δ)) ||x^(k) − x^(k−1)|| < ε,
where ε is the tolerance.
Richardson Method
The splitting matrix is Q = I and the iteration matrix is G = I − Q^{-1}A = I − A. So the iteration formula is
x^(k) = (I − A)x^(k−1) + b = x^(k−1) + r^(k−1),
where r^(k−1) is the residual vector, defined by r^(k−1) = b − Ax^(k−1). With the preceding theorem, one can easily show that if ||I − A|| < 1, then the sequence x^(k) generated by the Richardson iteration converges to the solution of Ax = b.
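A minimal sketch of the Richardson iteration in residual form (our code; the name richardson and the stopping rule are our choices):

```python
import numpy as np

def richardson(A, b, x0, tol=1e-10, max_iter=10000):
    """Richardson iteration: x_new = x_old + r with residual r = b - A x_old."""
    x = x0.astype(float)
    for k in range(max_iter):
        r = b - A @ x                  # residual vector r^(k-1)
        if np.linalg.norm(r, np.inf) < tol:
            return x, k
        x = x + r                      # x^(k) = x^(k-1) + r^(k-1)
    return x, max_iter
```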
Exercise (in class)
Find or write down the explicit form of the iteration matrix G = I − Q^{-1}A and state the iteration formula of the Richardson method for the problem Ax = b with
A = [ 1    1/2  1/3
      1/3  1    1/2
      1/2  1/3  1   ],   b = (11/18, 11/18, 11/18)^T.
Show that the Richardson method is successful (i.e., x^(k) → A^{-1}b) for this problem.
For example, with initial guess x^(0) = (0, 0, 0)^T, the sequence {x^(k)} generated by the Richardson method is:
x^(0) = (0.00000, 0.00000, 0.00000)^T
x^(1) = (0.61111, 0.61111, 0.61111)^T
x^(10) = (0.27950, 0.27950, 0.27950)^T
x^(40) = (0.33311, 0.33311, 0.33311)^T
x^(80) = (0.33333, 0.33333, 0.33333)^T
Example
Discuss whether the Richardson method is successful for the problem Ax = b with
A = [ 1    1/2  0    0
      1/2  1    1/2  0
      0    1/2  1    1/2
      0    0    1/2  1   ],   b = (1, 1, 1, 1)^T.
Solution.
The splitting matrix is Q = I and the iteration matrix is
G = I − Q^{-1}A = I − A = [  0    −1/2   0     0
                            −1/2   0    −1/2   0
                             0    −1/2   0    −1/2
                             0     0    −1/2   0  ].
Check: ||G||_1 = 1 and ||G||_∞ = 1, so neither norm settles convergence. Recall that ρ(G) ≤ ||G||.
0 = det(λI − G) = λ^4 − (3/4)λ^2 + 1/16
⟹ λ^2 = 3/8 ± √5/8 < 1
⟹ ρ(G) < 1; indeed ||G||_2 = ρ(G) < 1, since G is real and symmetric.
Conclusion: The Richardson method works!
Let A = D + L + U, where D = diag(A), L is the strictly lower triangular part of A, and U is the strictly upper triangular part of A.

Jacobi Method
In the Jacobi method, the splitting matrix is Q = D and the iteration matrix is
G = −D^{-1}(L + U) = I − Q^{-1}A.
The iteration formula is
Dx^(k) = −(L + U)x^(k−1) + b,
i.e., Qx^(k) = (Q − A)x^(k−1) + b    (k ≥ 1).
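A minimal sketch of the Jacobi iteration (our code; assumes a_ii ≠ 0). Every component is updated from the previous iterate only, so the whole sweep can be written with one matrix-vector product:

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-10, max_iter=1000):
    """Jacobi iteration: D x_new = -(L + U) x_old + b."""
    d = np.diag(A)                     # the diagonal D, as a vector
    R = A - np.diag(d)                 # the off-diagonal part L + U
    x = x0.astype(float)
    for k in range(max_iter):
        x_new = (b - R @ x) / d        # all components use old values only
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter
```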
Jacobi iteration: componentwise form and applicability. Write the linear system Ax = b as
a_11 x_1 + a_12 x_2 + ··· + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n = b_2
...
a_n1 x_1 + a_n2 x_2 + ··· + a_nn x_n = b_n
Then the componentwise form of the Jacobi iteration is
a_11 x_1^(k) = −(a_12 x_2^(k−1) + a_13 x_3^(k−1) + ··· + a_1n x_n^(k−1) − b_1)
a_22 x_2^(k) = −(a_21 x_1^(k−1) + a_23 x_3^(k−1) + ··· + a_2n x_n^(k−1) − b_2)
...
a_nn x_n^(k) = −(a_n1 x_1^(k−1) + a_n2 x_2^(k−1) + ··· + a_n,n−1 x_{n−1}^(k−1) − b_n)
Under what condition is the Jacobi iteration well defined (usable)? It requires a_ii ≠ 0 for every i.
Example: use the Jacobi method to solve
2x_1 − x_2 − x_3 = −5
x_1 + 5x_2 − x_3 = 8
x_1 + x_2 + 10x_3 = 11
Solution: The splitting matrix is Q = diag{2, 5, 10}, and the corresponding componentwise Jacobi iteration is
x_1^(k) = 0.5x_2^(k−1) + 0.5x_3^(k−1) − 2.5
x_2^(k) = −0.2x_1^(k−1) + 0.2x_3^(k−1) + 1.6
x_3^(k) = −0.1x_1^(k−1) − 0.1x_2^(k−1) + 1.1
or, in matrix form,
[ x_1^(k) ]   [  0    0.5  0.5 ] [ x_1^(k−1) ]   [ −2.5 ]
[ x_2^(k) ] = [ −0.2  0    0.2 ] [ x_2^(k−1) ] + [  1.6 ]
[ x_3^(k) ]   [ −0.1 −0.1  0   ] [ x_3^(k−1) ]   [  1.1 ]
Does this Jacobi iteration converge? Try computing ||G||: the row-sum norm gives ||G||_∞ = 1, which is inconclusive, but the column-sum norm gives ||G||_1 = 0.7 < 1, so the Jacobi iteration converges. With initial value x^(0) = (1, 1, 1)^T, the computed results are as follows:

k | x_1^(k)   | x_2^(k)  | x_3^(k)  | ||x^(k) − x^(k−1)||
0 | 1         | 1        | 1        |
1 | −1.5      | 1.6      | 0.9      | 0.6
2 | −1.25     | 2.08     | 1.09     | 0.48
3 | −0.915    | 2.068    | 1.017    | 0.355
4 | −0.9575   | 1.9864   | 0.9847   | 0.0425
5 | −1.01445  | 1.98844  | 0.99711  | 0.05695
6 | −1.00722  | 2.00231  | 1.0026   | 0.00723
7 | −0.997543 | 2.00197  | 1.00049  | 0.009677

(The exact solution is x = (−1, 2, 1)^T.)
Theorem (on Convergence of the Jacobi Method)
If A is diagonally dominant, then the sequence produced by the Jacobi iteration converges to the solution of Ax = b for any starting vector.

Proof. Diagonal dominance means that
|a_ii| > Σ_{j=1, j≠i}^n |a_ij|    (1 ≤ i ≤ n).
It is easy to compute that
||I − D^{-1}A||_∞ = max_{1≤i≤n} Σ_{j=1, j≠i}^n |a_ij| / |a_ii| < 1.
By the preceding theorem, the Jacobi iteration converges.
Example
Discuss whether the Jacobi method is successful for the problem Ax = b with
A = [ 2  1  0  0
      1  2  1  0
      0  1  2  1
      0  0  1  2 ],   b = (2, 2, 2, 2)^T.

Solution.
The splitting matrix is Q = D = diag(A) and the iteration matrix is
G = I − Q^{-1}A = I − D^{-1}A = [  0    −1/2   0     0
                                  −1/2   0    −1/2   0
                                   0    −1/2   0    −1/2
                                   0     0    −1/2   0  ].
Check: ||G||_1 = 1 and ||G||_∞ = 1, which is inconclusive. Recall that ρ(G) ≤ ||G|| and ||G||_2 = ρ(G) if G is real and symmetric.
As before,
0 = det(λI − G) = λ^4 − (3/4)λ^2 + 1/16 ⟹ λ^2 = 3/8 ± √5/8 < 1 ⟹ ρ(G) < 1 and ||G||_2 = ρ(G) < 1.
Conclusion: The Jacobi method works!
Example
Discuss whether the Richardson method is successful for the problem Ax = b with
A = [ 2  1  0  0
      1  2  1  0
      0  1  2  1
      0  0  1  2 ],   b = (2, 2, 2, 2)^T.
Recall that in the Richardson method the splitting matrix is Q = I and the iteration matrix is G = I − Q^{-1}A = I − A.
Recall that the spectral radius of A is
ρ(A) = max{ |λ| : det(A − λI) = 0 }.

Theorem (on Similar Upper Triangular Matrices)
Every square matrix is similar to a (possibly complex) upper triangular matrix whose off-diagonal elements are arbitrarily small.

Proof. Hint: use Schur's theorem from Section 5.2. See the book, p. 214.
Theorem (on Spectral Radius)
The spectral radius function satisfies the equation
ρ(A) = inf ||A||,
in which the infimum is taken over all subordinate matrix norms.

Proof. Hint: note that ρ(A) ≤ ||A|| for every subordinate matrix norm, so ρ(A) ≤ inf ||A||; for the other direction, use the preceding theorem to construct, for any ε > 0, a subordinate norm with ||A|| ≤ ρ(A) + ε. See also the book, p. 214.
Remark
This theorem tells us that for any matrix A, its spectral radius is a lower bound for every subordinate matrix norm, and moreover a subordinate matrix norm exists with value arbitrarily close to the spectral radius; i.e., for every ε > 0 there is a subordinate matrix norm ||·||_ε s.t.
ρ(A) ≤ ||A||_ε ≤ ρ(A) + ε.
In particular, if ρ(A) < 1, then there is a subordinate matrix norm s.t. ||A|| < 1.
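As a quick numerical check (a sketch using NumPy; the helper name spectral_radius is ours), one can compute ρ(G) directly from the eigenvalues. Applied to the Richardson iteration matrix of the last example, it shows ρ(G) > 1:

```python
import numpy as np

def spectral_radius(A):
    """rho(A) = max |lambda| over the eigenvalues of A."""
    return max(abs(np.linalg.eigvals(A)))

# Richardson iteration matrix G = I - A for the tridiagonal example above
A = np.array([[2., 1, 0, 0], [1, 2, 1, 0], [0, 1, 2, 1], [0, 0, 1, 2]])
G = np.eye(4) - A
print(spectral_radius(G))   # about 2.618 > 1: Richardson diverges here
```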
Theorem (on Necessary and Sufficient Conditions for Iterative Method Convergence)
For the iterative formula x^(k) = Gx^(k−1) + c to produce a sequence converging to (I − G)^{-1}c, for any vector c and any starting vector x^(0), it is necessary and sufficient that the spectral radius of G be less than 1, i.e., ρ(G) < 1.
Proof. (⇐) Suppose that ρ(G) < 1. By the Theorem on Spectral Radius, there is a subordinate matrix norm s.t. ||G|| < 1. We write
x^(1) = Gx^(0) + c
x^(2) = Gx^(1) + c = G^2 x^(0) + Gc + c
x^(3) = Gx^(2) + c = G^3 x^(0) + G^2 c + Gc + c
The general formula is
x^(k) = G^k x^(0) + Σ_{j=0}^{k−1} G^j c.    (3)
(continued) Then
||G^k x^(0)|| ≤ ||G^k|| ||x^(0)|| ≤ ||G||^k ||x^(0)|| → 0 as k → ∞.
By the Theorem on Neumann Series (p. 198), we have
Σ_{j=0}^∞ G^j c = (I − G)^{-1} c.
Thus, by letting k → ∞ in (3), we obtain
lim_{k→∞} x^(k) = (I − G)^{-1} c.
(continued) (⇒) For the converse, suppose that ρ(G) ≥ 1. Select u and λ s.t.
Gu = λu,   |λ| ≥ 1,   u ≠ 0.
Let c = u and x^(0) = 0. By equation (3),
x^(k) = Σ_{j=0}^{k−1} G^j u = Σ_{j=0}^{k−1} λ^j u.
If λ = 1, then x^(k) = ku, which clearly diverges as k → ∞. If λ ≠ 1, then x^(k) = (λ^k − 1)(λ − 1)^{-1} u, which also diverges since lim_{k→∞} λ^k does not exist when |λ| ≥ 1.
Corollary (Iterative Method Convergence)
The iterative formula Qx^(k) = (Q − A)x^(k−1) + b (k ≥ 1) will produce a sequence converging to the solution of Ax = b, for any starting vector x^(0), if ρ(I − Q^{-1}A) < 1.
Let A = D + L + U, where D = diag(A), L is the strictly lower triangular part of A, and U is the strictly upper triangular part of A.

Gauss-Seidel Method
In the Gauss-Seidel method, the splitting matrix is Q = D + L and the iteration matrix is
G = −(D + L)^{-1}U = I − Q^{-1}A.
So the iteration formula is
(D + L)x^(k) = −Ux^(k−1) + b,
i.e., Qx^(k) = (Q − A)x^(k−1) + b    (k ≥ 1).
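A componentwise sketch of the Gauss-Seidel iteration (our code; assumes a_ii ≠ 0). Unlike Jacobi, each new component is used immediately in the rows below it, which is exactly the forward solve with D + L:

```python
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-10, max_iter=1000):
    """Gauss-Seidel iteration: (D + L) x_new = -U x_old + b."""
    n = len(b)
    x = x0.astype(float)
    for k in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # entries 0..i-1 of x are already new; i+1..n-1 are still old
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            return x, k + 1
    return x, max_iter
```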
Gauss-Seidel iteration: componentwise form and applicability. For the linear system Ax = b,
a_11 x_1 + a_12 x_2 + ··· + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n = b_2
...
a_n1 x_1 + a_n2 x_2 + ··· + a_nn x_n = b_n
the componentwise form of the Gauss-Seidel iteration is
x_1^(k) = −(1/a_11)(a_12 x_2^(k−1) + ··· + a_1n x_n^(k−1) − b_1)
x_2^(k) = −(1/a_22)(a_21 x_1^(k) + a_23 x_3^(k−1) + ··· + a_2n x_n^(k−1) − b_2)
x_3^(k) = −(1/a_33)(a_31 x_1^(k) + a_32 x_2^(k) + a_34 x_4^(k−1) + ··· + a_3n x_n^(k−1) − b_3)
...
x_n^(k) = −(1/a_nn)(a_n1 x_1^(k) + a_n2 x_2^(k) + ··· + a_n,n−1 x_{n−1}^(k) − b_n)
Under what condition is the Gauss-Seidel iteration well defined (usable)? Again, it requires a_ii ≠ 0 for every i.
~µ^gauss-seidels {) ( ) 1 8 0 A = 1 0 9 9 1 1 Њ x (0) = (0, 0, 0) T. ): Ax = b,ù ), b = ( 7 8 7.
~µ^gauss-seidels {) ( ) 1 8 0 A = 1 0 9 9 1 1 Њ x (0) = (0, 0, 0) T. Ax = b,ù ), b = ( 7 8 7 ): Ïé a 22 = 0, IŠý?n ^ S µ.
~µ^gauss-seidels {) ( ) 1 8 0 A = 1 0 9 9 1 1 Њ x (0) = (0, 0, 0) T. Ax = b,ù ), b = ( 7 8 7 ): Ïé a 22 = 0, IŠý?n ^ S µ ( ) ( ) 9 1 1 77 A 1 8 0, b 1 0 9 8.
The iteration formulas are
x_1^(k) = (1/9)(x_2^(k−1) + x_3^(k−1) + 7)
x_2^(k) = (1/8)(x_1^(k) + 7)
x_3^(k) = (1/9)(x_1^(k) + 8)
Iterating to step k = 4:
x^(1) = (0.7778, 0.9722, 0.9753)
x^(2) = (0.9942, 0.9993, 0.9994)
x^(3) = (0.9999, 0.9999, 0.9999)
x^(4) = (1.0000, 1.0000, 1.0000)
(The exact solution is x = (1, 1, 1)^T.)
Theorem (on Gauss-Seidel Method Convergence) If A is diagonally dominant, then the Gauss-Seidel Method converges for any starting vector.
Proof. It suffices to prove that ρ(I − Q^{-1}A) < 1. Let λ be any eigenvalue of I − Q^{-1}A and x a corresponding eigenvector. Assume, WLOG, that ||x||_∞ = 1. We have
(I − Q^{-1}A)x = λx, or Qx − Ax = λQx.
Since the splitting matrix Q is the lower triangular part of A, including its diagonal, this reads componentwise
−Σ_{j=i+1}^n a_ij x_j = λ Σ_{j=1}^i a_ij x_j    (1 ≤ i ≤ n).
Then
λ a_ii x_i = −Σ_{j=i+1}^n a_ij x_j − λ Σ_{j=1}^{i−1} a_ij x_j    (1 ≤ i ≤ n).
Choose an index i s.t. |x_i| = 1 ≥ |x_j| for all j. Then
|λ| |a_ii| ≤ Σ_{j=i+1}^n |a_ij| + |λ| Σ_{j=1}^{i−1} |a_ij|.
Solving for |λ| and using the diagonal dominance of A, we get
|λ| ≤ { Σ_{j=i+1}^n |a_ij| } { |a_ii| − Σ_{j=1}^{i−1} |a_ij| }^{-1} < 1.
Example: solve Ax = b by both the Jacobi and the Gauss-Seidel iterations, where
A = [ 2  −1   1
      1   1   1
      1   1  −2 ],
and discuss their convergence.
Solution: The Jacobi iteration matrix is
G = I − D^{-1}A = [  0   1/2  −1/2
                    −1    0   −1
                    1/2  1/2   0  ],
with characteristic polynomial
det(λI − G) = λ^3 + (5/4)λ = 0 ⟹ λ_1 = 0, λ_{2,3} = ±(√5/2)i.
Since ρ(G) = √5/2 > 1, the Jacobi iteration does not converge.
If instead the Gauss-Seidel iteration is used, the iteration matrix is
G = −(D + L)^{-1}U = [ 0  1/2  −1/2
                       0 −1/2  −1/2
                       0   0   −1/2 ],
whose eigenvalues are found to be
λ_1 = 0,   λ_{2,3} = −1/2.
Since ρ(G) = 1/2 < 1, the Gauss-Seidel iteration converges.
Let A = D + L + U, where D = diag(A), L is the strictly lower triangular part of A, and U is the strictly upper triangular part of A.

SOR (Successive Over-Relaxation) Method
In the SOR method, the splitting matrix is Q = ω^{-1}D + L, the iteration matrix is
G = (D + ωL)^{-1}((1 − ω)D − ωU) = I − Q^{-1}A,
and the iteration formula is
(D + ωL)x^(k) = ω(−Ux^(k−1) + b) + (1 − ω)Dx^(k−1),
i.e., Qx^(k) = (Q − A)x^(k−1) + b    (k ≥ 1).
SOR iteration: componentwise form and applicability. Recall the componentwise Gauss-Seidel iteration given above. The componentwise form of the SOR iteration (a sketch of which follows below) is
x_1^(k) = x_1^(k−1) − (ω/a_11)(a_11 x_1^(k−1) + a_12 x_2^(k−1) + a_13 x_3^(k−1) + ··· + a_1n x_n^(k−1) − b_1)
x_2^(k) = x_2^(k−1) − (ω/a_22)(a_21 x_1^(k) + a_22 x_2^(k−1) + a_23 x_3^(k−1) + ··· + a_2n x_n^(k−1) − b_2)
x_3^(k) = x_3^(k−1) − (ω/a_33)(a_31 x_1^(k) + a_32 x_2^(k) + a_33 x_3^(k−1) + a_34 x_4^(k−1) + ··· + a_3n x_n^(k−1) − b_3)
...
x_n^(k) = x_n^(k−1) − (ω/a_nn)(a_n1 x_1^(k) + a_n2 x_2^(k) + ··· + a_n,n−1 x_{n−1}^(k) + a_nn x_n^(k−1) − b_n)
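A componentwise sketch of SOR (our code; assumes a_ii ≠ 0 and 0 < ω < 2). Each component moves from its old value by ω times the Gauss-Seidel correction, so ω = 1 reproduces Gauss-Seidel exactly:

```python
import numpy as np

def sor(A, b, x0, omega, tol=1e-10, max_iter=1000):
    """SOR sweep: x_i <- x_i + omega * (b_i - row_i . x) / a_ii."""
    n = len(b)
    x = x0.astype(float)
    for k in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # residual of row i against the partially updated iterate
            r_i = b[i] - A[i, :] @ x
            x[i] = x[i] + omega * r_i / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            return x, k + 1
    return x, max_iter
```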
Theorem (on SOR Method Convergence)
In the SOR method, suppose that the splitting matrix Q is chosen to be αD − C, where α is a real parameter, D is any positive definite Hermitian matrix, and C is any matrix satisfying C + C* = D − A. If A is positive definite Hermitian, if Q is nonsingular, and if α > 1/2, then the SOR iteration converges for any starting vector.

Remark
In the literature, the parameter α is usually denoted by 1/ω. So the SOR iteration converges when 0 < ω < 2.
Summary of iterative methods
Equivalent forms (assume Q is nonsingular):
Ax = b ⟺ Qx = (Q − A)x + b ⟺ x = (I − Q^{-1}A)x + Q^{-1}b.
Iteration formulas:
Qx^(k) = (Q − A)x^(k−1) + b, i.e., x^(k) = (I − Q^{-1}A)x^(k−1) + Q^{-1}b.
Let A = D + L + U and assume 0 < ω < 2.

Method        | Splitting matrix Q                       | Iteration matrix G = I − Q^{-1}A
Richardson    | I                                        | I − A
Jacobi        | D                                        | I − D^{-1}A
Gauss-Seidel  | D + L                                    | −(D + L)^{-1}U
SOR           | ω^{-1}D + L                              | (D + ωL)^{-1}((1 − ω)D − ωU)
SSOR          | (ω(2 − ω))^{-1}(D + ωL)D^{-1}(D + ωU)    | I − Q^{-1}A
Extrapolation
Extrapolation is a technique that can be used to improve the convergence properties of a linear iterative process. Consider the iterative formula
x^(k) = Gx^(k−1) + c.    (4)
Introduce a parameter γ ≠ 0 and embed the above iteration in a one-parameter family of iterative methods given by
x^(k) = γ(Gx^(k−1) + c) + (1 − γ)x^(k−1) = G_γ x^(k−1) + γc,    (5)
where G_γ = γG + (1 − γ)I.
If the iteration in (5) converges, say to x, then by taking a limit we get
x = γ(Gx + c) + (1 − γ)x, or x = Gx + c.
Note that the iteration in (4) is used to produce a sequence converging to a solution of x = Gx + c; if G = I − Q^{-1}A and c = Q^{-1}b, then it corresponds to solving Ax = b.
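As a one-step sketch (our function name; G, c, x are assumed to be NumPy arrays):

```python
def extrapolated_step(G, c, x, gamma):
    """One step of (5): x_new = gamma*(G x + c) + (1 - gamma)*x,
    i.e. x_new = G_gamma x + gamma*c with G_gamma = gamma*G + (1 - gamma)*I."""
    return gamma * (G @ x + c) + (1.0 - gamma) * x
```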
Theorem (on Eigenvalues of p(A))
If λ is an eigenvalue of A and if p is a polynomial, then p(λ) is an eigenvalue of p(A).

Proof. Hint: let Ax = λx with x ≠ 0. It is easy to see that
A^k x = λ^k x    (k ≥ 0).
So, writing p(z) = Σ_{k=0}^m c_k z^k,
p(A)x = Σ_{k=0}^m c_k A^k x = Σ_{k=0}^m c_k λ^k x = p(λ)x.
Remark
Suppose we do not know the eigenvalues of G precisely, but know only that some interval, say [a, b], contains all of them. By the theorem, the eigenvalues of G_γ = γG + (1 − γ)I lie in the interval with endpoints γa + (1 − γ) and γb + (1 − γ). Denote by Λ(A) the set of eigenvalues of a matrix A. Then
ρ(G_γ) = max_{λ∈Λ(G_γ)} |λ| = max_{λ∈Λ(G)} |γλ + 1 − γ| ≤ max_{λ∈[a,b]} |γλ + 1 − γ|.
Now, the purpose of extrapolation is to choose γ to achieve
min_γ ρ(G_γ) = min_γ max_{λ∈Λ(G)} |γλ + 1 − γ| ≤ min_γ max_{λ∈[a,b]} |γλ + 1 − γ|,
and in particular to make min_γ max_{λ∈[a,b]} |γλ + 1 − γ| < 1.
Theorem (on Optimal Extrapolation Parameters)
If the only information available about the eigenvalues of G is that they lie in the interval [a, b], and if 1 ∉ [a, b], then the best choice for γ is 2/(2 − a − b). With this value of γ, ρ(G_γ) ≤ 1 − |γ|d, where d is the distance from 1 to [a, b].

Proof. See your lecture notes or pp. 222-223 in the book.
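As a quick illustration (numbers of our own choosing, not from the book): if the eigenvalues of G are known only to lie in [a, b] = [−1, 1/2], then γ = 2/(2 − (−1) − 1/2) = 4/5 and d = 1 − 1/2 = 1/2, so ρ(G_γ) ≤ 1 − (4/5)(1/2) = 3/5 < 1. The extrapolated iteration is guaranteed to converge even though the interval alone does not guarantee ρ(G) < 1.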
Remark The extrapolation process or technique just discussed can be applied to methods that are not convergent themselves. All that is required is that the eigenvalues of G be real and lie in an interval that does not contain 1.
Example
Determine the spectral radius of the optimal extrapolated Richardson method.
Hint: in the Richardson method, Q = I and G = I − A.

Example
Determine the spectral radius of the optimal extrapolated Jacobi method.
Chebyshev Acceleration
Chebyshev acceleration is an acceleration procedure that tries to use all available information to get a better approximation to the solution of the linear system. As before, consider a basic iterative method
x^(k) = Gx^(k−1) + c.    (6)
Recall that a solution of the problem is a vector x s.t. x = Gx + c. At step k in the process, we shall have computed the vectors x^(1), x^(2), ..., x^(k), and we ask whether some linear combination of these vectors is perhaps a better approximation to the exact solution than x^(k) itself.
Assume that a_0^(k) + a_1^(k) + ··· + a_k^(k) = 1 and set
u^(k) = Σ_{i=0}^k a_i^(k) x^(i).
Then
u^(k) − x = Σ_{i=0}^k a_i^(k) (x^(i) − x) = Σ_{i=0}^k a_i^(k) G^i (x^(0) − x) = p(G)(x^(0) − x),
where p is the polynomial defined by
p(z) = Σ_{i=0}^k a_i^(k) z^i    (so that p(1) = 1).
Taking norms, we get
||u^(k) − x|| ≤ ||p(G)|| ||x^(0) − x||.
If the eigenvalues µ_i of G lie within some bounded set S in the complex plane, then by the previous analysis,
ρ(p(G)) = max_{1≤i≤n} |p(µ_i)| ≤ max_{z∈S} |p(z)|.
Then the problem reduces to
min_{p∈P_k, p(1)=1} ρ(p(G)) ≤ min_{p∈P_k, p(1)=1} max_{z∈S} |p(z)|,
where P_k denotes the set of all real polynomials of degree at most k. This is a standard problem in approximation theory.
For example, if S is an interval, say [a, b] ⊂ R, not containing 1, then a scaled and shifted Chebyshev polynomial solves this min-max problem. The classic Chebyshev polynomial T_k (k ≥ 1) is the unique polynomial of degree k with leading coefficient 2^{k−1} that minimizes
max_{−1≤z≤1} |T_k(z)|.
These polynomials can be generated recursively by
T_0(z) = 1,   T_1(z) = z,   T_k(z) = 2zT_{k−1}(z) − T_{k−2}(z)    (k ≥ 2).
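The three-term recurrence is straightforward to evaluate numerically (a sketch; the function name is ours):

```python
def chebyshev(k, z):
    """Evaluate T_k(z) via T_0 = 1, T_1 = z, T_k = 2 z T_{k-1} - T_{k-2}."""
    t_prev, t = 1.0, z
    if k == 0:
        return t_prev
    for _ in range(2, k + 1):
        t_prev, t = t, 2 * z * t - t_prev
    return t
```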
Now, suppose that the eigenvalues of G are contained in an interval [a, b] with 1 ∉ [a, b], say b < 1. We are interested in the min-max problem
min_{p_k∈P_k, p_k(1)=1} max_{z∈[a,b]} |p_k(z)|.
The answer to this problem is contained in the four lemmas on pp. 225-227 of the textbook.