Chapter IV Solving Systems of Linear Equations
Goal: to construct some general-purpose algorithms for solving systems of linear equations.
4.6 Solution of Equations by Iterative Methods
The Gaussian algorithm and its variants are called direct methods for solving the problem Ax = b. They proceed through a finite number of steps and produce a solution x that would be completely accurate were it not for roundoff errors. An indirect method, by contrast, produces a sequence of vectors that ideally converges to the solution. The computation is halted when an approximate solution of some specified accuracy is obtained, or after a certain number of iterations. Indirect methods are almost always iterative in nature: a simple process is applied repeatedly to generate such a sequence.
Why use iterative methods for linear systems? A direct method costs O(n^3) arithmetic operations and O(n^2) storage. Iterative methods require less storage and are simple to program, and they are especially suited to large sparse systems (coefficient matrices containing many zero entries): when n >> 1 and A is sparse, each iteration is cheap, so iterative methods are often preferred for such problems.
General iterative methods. The basic framework of an iterative method for solving an equation F(x) = 0:
1. Rewrite F(x) = 0 in the equivalent fixed-point form x = Φ(x).
2. Choose an initial value x^(0) and generate the iteration sequence x^(k+1) = Φ(x^(k)).
3. If the limit x = lim_{k→+∞} x^(k) exists, then x solves the equation. In actual computation, the iteration is stopped when ||x^(k+1) − x^(k)|| < ε or when a prescribed number of iterations is reached.
If lim_{k→+∞} x^(k) does not exist, the iteration fails; one must choose a new iteration formula or a new initial value x^(0).
Here "x^(k) converges to x" means lim_{k→∞} ||x^(k) − x|| = 0. The basic idea of iterative methods for linear systems is the same as for nonlinear equations.
Consider the equation Ax = b, which we wish to put into the fixed-point form x = Gx + g. A certain nonsingular matrix Q, called the splitting matrix, is prescribed; then the problem above is equivalent to
Qx = (Q − A)x + b.
This suggests an iterative process, defined by
Qx^(k) = (Q − A)x^(k−1) + b    (k ≥ 1)    (1)
or
x^(k) = (I − Q^{-1}A)x^(k−1) + Q^{-1}b    (k ≥ 1)    (2)
where x^(0) is an arbitrary initial vector. Note that equation (2) is used for theoretical analysis only; in practice there is no need to compute Q^{-1}.
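To make the scheme concrete, here is a minimal sketch in Python/NumPy (our code, not from the book; the function name splitting_iteration and the stopping rule are our choices). It implements (1) directly: each step solves a system with Q rather than forming Q^{-1}.

```python
import numpy as np

def splitting_iteration(A, b, Q, x0, tol=1e-10, max_iter=1000):
    """General splitting iteration: solve Q x_new = (Q - A) x_old + b."""
    x = x0.astype(float)
    for k in range(max_iter):
        # Solve with Q rather than forming Q^{-1}; equation (2) is for
        # theoretical analysis only.
        x_new = np.linalg.solve(Q, (Q - A) @ x + b)
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter
```

In practice Q is chosen so that this solve is cheap (diagonal or triangular), as the methods below illustrate.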
Now, our objective is to choose Q (with det Q ≠ 0) s.t.
1. the sequence {x^(k)} is easily computed;
2. the sequence {x^(k)} converges rapidly to the solution of Ax = b.
Obviously, the exact solution x satisfies
Qx = (Q − A)x + b, i.e., x = (I − Q^{-1}A)x + Q^{-1}b.
Clearly x is a fixed point of F(x) ≡ (I − Q^{-1}A)x + Q^{-1}b. Then
x^(k) − x = (I − Q^{-1}A)(x^(k−1) − x)    (k ≥ 1)
so
||x^(k) − x|| ≤ ||I − Q^{-1}A|| ||x^(k−1) − x||    (k ≥ 1)
and therefore
||x^(k) − x|| ≤ ||I − Q^{-1}A||^k ||x^(0) − x||    (k ≥ 1).
Thus, if ||I − Q^{-1}A|| < 1, one can conclude that
lim_{k→∞} ||x^(k) − x|| = 0.
Observe that ||I − Q^{-1}A|| < 1 implies the invertibility of Q^{-1}A, and hence of A. Hence, we have:

Theorem (on Iterative Method Convergence)
If ||I − Q^{-1}A|| < 1 for some subordinate matrix norm, then the sequence produced by (1) converges to the solution of Ax = b for any initial vector x^(0).
Remark
The matrix G ≡ I − Q^{-1}A is usually called the iteration matrix. If δ ≡ ||I − Q^{-1}A|| < 1, then we can use the following stopping condition for the iterative method:
||x^(k) − x|| ≤ (δ/(1 − δ)) ||x^(k) − x^(k−1)|| < ε,
where ε is the tolerance.
Richardson Method
The splitting matrix is Q = I and the iteration matrix is G = I − Q^{-1}A = I − A. So the iteration formula is
x^(k) = (I − A)x^(k−1) + b = x^(k−1) + r^(k−1),
where r^(k−1) is the residual vector, defined by r^(k−1) = b − Ax^(k−1). With the preceding theorem, one can easily show that if ||I − A|| < 1, then the sequence x^(k) generated by the Richardson iteration converges to the solution of Ax = b.
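A minimal sketch of the Richardson iteration in residual form (our code; the name richardson and the stopping rule are our choices):

```python
import numpy as np

def richardson(A, b, x0, tol=1e-10, max_iter=10000):
    """Richardson iteration: x_new = x_old + r with residual r = b - A x_old."""
    x = x0.astype(float)
    for k in range(max_iter):
        r = b - A @ x                  # residual vector r^(k-1)
        if np.linalg.norm(r, np.inf) < tol:
            return x, k
        x = x + r                      # x^(k) = x^(k-1) + r^(k-1)
    return x, max_iter
```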
Exercise (in class)
Find or write down the explicit form of the iteration matrix G = I − Q^{-1}A and state the iteration formula of the Richardson method for the problem Ax = b with
A = [ 1    1/2  1/3
      1/3  1    1/2
      1/2  1/3  1   ],   b = (11/18, 11/18, 11/18)^T.
Show that the Richardson method is successful (i.e., x^(k) → A^{-1}b) for this problem.
For example, with initial guess x^(0) = (0, 0, 0)^T, the sequence {x^(k)} generated by the Richardson method is:
x^(0) = (0.00000, 0.00000, 0.00000)^T
x^(1) = (0.61111, 0.61111, 0.61111)^T
x^(10) = (0.27950, 0.27950, 0.27950)^T
x^(40) = (0.33311, 0.33311, 0.33311)^T
x^(80) = (0.33333, 0.33333, 0.33333)^T
Example
Discuss whether the Richardson method is successful for the problem Ax = b with
A = [ 1    1/2  0    0
      1/2  1    1/2  0
      0    1/2  1    1/2
      0    0    1/2  1   ],   b = (1, 1, 1, 1)^T.
Solution.
The splitting matrix is Q = I and the iteration matrix is
G = I − Q^{-1}A = I − A = [  0    −1/2   0     0
                            −1/2   0    −1/2   0
                             0    −1/2   0    −1/2
                             0     0    −1/2   0  ].
Check: ||G||_1 = 1 and ||G||_∞ = 1, so neither norm settles convergence. Recall that ρ(G) ≤ ||G||.
0 = det(λI − G) = λ^4 − (3/4)λ^2 + 1/16
⟹ λ^2 = 3/8 ± √5/8 < 1
⟹ ρ(G) < 1; indeed ||G||_2 = ρ(G) < 1, since G is real and symmetric.
Conclusion: The Richardson method works!
Let A = D + L + U, where D = diag(A), L is the strictly lower triangular part of A, and U is the strictly upper triangular part of A.

Jacobi Method
In the Jacobi method, the splitting matrix is Q = D and the iteration matrix is
G = −D^{-1}(L + U) = I − Q^{-1}A.
The iteration formula is
Dx^(k) = −(L + U)x^(k−1) + b,
i.e., Qx^(k) = (Q − A)x^(k−1) + b    (k ≥ 1).
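A minimal sketch of the Jacobi iteration (our code; assumes a_ii ≠ 0). Every component is updated from the previous iterate only, so the whole sweep can be written with one matrix-vector product:

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-10, max_iter=1000):
    """Jacobi iteration: D x_new = -(L + U) x_old + b."""
    d = np.diag(A)                     # the diagonal D, as a vector
    R = A - np.diag(d)                 # the off-diagonal part L + U
    x = x0.astype(float)
    for k in range(max_iter):
        x_new = (b - R @ x) / d        # all components use old values only
        if np.linalg.norm(x_new - x, np.inf) < tol:
            return x_new, k + 1
        x = x_new
    return x, max_iter
```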
Jacobi iteration: componentwise form and applicability. Write the linear system Ax = b as
a_11 x_1 + a_12 x_2 + ··· + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n = b_2
...
a_n1 x_1 + a_n2 x_2 + ··· + a_nn x_n = b_n
Then the componentwise form of the Jacobi iteration is
a_11 x_1^(k) = −(a_12 x_2^(k−1) + a_13 x_3^(k−1) + ··· + a_1n x_n^(k−1) − b_1)
a_22 x_2^(k) = −(a_21 x_1^(k−1) + a_23 x_3^(k−1) + ··· + a_2n x_n^(k−1) − b_2)
...
a_nn x_n^(k) = −(a_n1 x_1^(k−1) + a_n2 x_2^(k−1) + ··· + a_n,n−1 x_{n−1}^(k−1) − b_n)
Under what condition is the Jacobi iteration well defined (usable)? It requires a_ii ≠ 0 for every i.
Example: use the Jacobi method to solve
2x_1 − x_2 − x_3 = −5
x_1 + 5x_2 − x_3 = 8
x_1 + x_2 + 10x_3 = 11
Solution: The splitting matrix is Q = diag{2, 5, 10}, and the corresponding componentwise Jacobi iteration is
x_1^(k) = 0.5x_2^(k−1) + 0.5x_3^(k−1) − 2.5
x_2^(k) = −0.2x_1^(k−1) + 0.2x_3^(k−1) + 1.6
x_3^(k) = −0.1x_1^(k−1) − 0.1x_2^(k−1) + 1.1
or, in matrix form,
[ x_1^(k) ]   [  0    0.5  0.5 ] [ x_1^(k−1) ]   [ −2.5 ]
[ x_2^(k) ] = [ −0.2  0    0.2 ] [ x_2^(k−1) ] + [  1.6 ]
[ x_3^(k) ]   [ −0.1 −0.1  0   ] [ x_3^(k−1) ]   [  1.1 ]
Does this Jacobi iteration converge? Try computing ||G||: the row-sum norm gives ||G||_∞ = 1, which is inconclusive, but the column-sum norm gives ||G||_1 = 0.7 < 1, so the Jacobi iteration converges. With initial value x^(0) = (1, 1, 1)^T, the computed results are as follows:

k | x_1^(k)   | x_2^(k)  | x_3^(k)  | ||x^(k) − x^(k−1)||
0 | 1         | 1        | 1        |
1 | −1.5      | 1.6      | 0.9      | 0.6
2 | −1.25     | 2.08     | 1.09     | 0.48
3 | −0.915    | 2.068    | 1.017    | 0.355
4 | −0.9575   | 1.9864   | 0.9847   | 0.0425
5 | −1.01445  | 1.98844  | 0.99711  | 0.05695
6 | −1.00722  | 2.00231  | 1.0026   | 0.00723
7 | −0.997543 | 2.00197  | 1.00049  | 0.009677

(The exact solution is x = (−1, 2, 1)^T.)
Theorem (on Convergence of the Jacobi Method)
If A is diagonally dominant, then the sequence produced by the Jacobi iteration converges to the solution of Ax = b for any starting vector.

Proof. Diagonal dominance means that
|a_ii| > Σ_{j=1, j≠i}^n |a_ij|    (1 ≤ i ≤ n).
It is easy to compute that
||I − D^{-1}A||_∞ = max_{1≤i≤n} Σ_{j=1, j≠i}^n |a_ij| / |a_ii| < 1.
By the preceding theorem, the Jacobi iteration converges.
Example
Discuss whether the Jacobi method is successful for the problem Ax = b with
A = [ 2  1  0  0
      1  2  1  0
      0  1  2  1
      0  0  1  2 ],   b = (2, 2, 2, 2)^T.

Solution.
The splitting matrix is Q = D = diag(A) and the iteration matrix is
G = I − Q^{-1}A = I − D^{-1}A = [  0    −1/2   0     0
                                  −1/2   0    −1/2   0
                                   0    −1/2   0    −1/2
                                   0     0    −1/2   0  ].
Check: ||G||_1 = 1 and ||G||_∞ = 1, which is inconclusive. Recall that ρ(G) ≤ ||G|| and ||G||_2 = ρ(G) if G is real and symmetric.
As before,
0 = det(λI − G) = λ^4 − (3/4)λ^2 + 1/16 ⟹ λ^2 = 3/8 ± √5/8 < 1 ⟹ ρ(G) < 1 and ||G||_2 = ρ(G) < 1.
Conclusion: The Jacobi method works!
Example
Discuss whether the Richardson method is successful for the problem Ax = b with
A = [ 2  1  0  0
      1  2  1  0
      0  1  2  1
      0  0  1  2 ],   b = (2, 2, 2, 2)^T.
Recall that in the Richardson method the splitting matrix is Q = I and the iteration matrix is G = I − Q^{-1}A = I − A.
Recall that the spectral radius of A is
ρ(A) = max{ |λ| : det(A − λI) = 0 }.

Theorem (on Similar Upper Triangular Matrices)
Every square matrix is similar to a (possibly complex) upper triangular matrix whose off-diagonal elements are arbitrarily small.

Proof. Hint: use Schur's theorem from Section 5.2. See the book, p. 214.
Theorem (on Spectral Radius)
The spectral radius function satisfies the equation
ρ(A) = inf ||A||,
in which the infimum is taken over all subordinate matrix norms.

Proof. Hint: note that ρ(A) ≤ ||A|| for every subordinate matrix norm, so ρ(A) ≤ inf ||A||; for the other direction, use the preceding theorem to construct, for any ε > 0, a subordinate norm with ||A|| ≤ ρ(A) + ε. See also the book, p. 214.
Remark
This theorem tells us that for any matrix A, its spectral radius is a lower bound for every subordinate matrix norm, and moreover a subordinate matrix norm exists with value arbitrarily close to the spectral radius; i.e., for every ε > 0 there is a subordinate matrix norm ||·||_ε s.t.
ρ(A) ≤ ||A||_ε ≤ ρ(A) + ε.
In particular, if ρ(A) < 1, then there is a subordinate matrix norm s.t. ||A|| < 1.
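As a quick numerical check (a sketch using NumPy; the helper name spectral_radius is ours), one can compute ρ(G) directly from the eigenvalues. Applied to the Richardson iteration matrix of the last example, it shows ρ(G) > 1:

```python
import numpy as np

def spectral_radius(A):
    """rho(A) = max |lambda| over the eigenvalues of A."""
    return max(abs(np.linalg.eigvals(A)))

# Richardson iteration matrix G = I - A for the tridiagonal example above
A = np.array([[2., 1, 0, 0], [1, 2, 1, 0], [0, 1, 2, 1], [0, 0, 1, 2]])
G = np.eye(4) - A
print(spectral_radius(G))   # about 2.618 > 1: Richardson diverges here
```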
Theorem (on Necessary and Sufficient Conditions for Iterative Method Convergence)
For the iterative formula x^(k) = Gx^(k−1) + c to produce a sequence converging to (I − G)^{-1}c, for any vector c and any starting vector x^(0), it is necessary and sufficient that the spectral radius of G be less than 1, i.e., ρ(G) < 1.
Proof. (⇐) Suppose that ρ(G) < 1. By the Theorem on Spectral Radius, there is a subordinate matrix norm s.t. ||G|| < 1. We write
x^(1) = Gx^(0) + c
x^(2) = Gx^(1) + c = G^2 x^(0) + Gc + c
x^(3) = Gx^(2) + c = G^3 x^(0) + G^2 c + Gc + c
The general formula is
x^(k) = G^k x^(0) + Σ_{j=0}^{k−1} G^j c.    (3)
(continued) Then
||G^k x^(0)|| ≤ ||G^k|| ||x^(0)|| ≤ ||G||^k ||x^(0)|| → 0 as k → ∞.
By the Theorem on Neumann Series (p. 198), we have
Σ_{j=0}^∞ G^j c = (I − G)^{-1} c.
Thus, by letting k → ∞ in (3), we obtain
lim_{k→∞} x^(k) = (I − G)^{-1} c.
(continued) (⇒) For the converse, suppose that ρ(G) ≥ 1. Select u and λ s.t.
Gu = λu,   |λ| ≥ 1,   u ≠ 0.
Let c = u and x^(0) = 0. By equation (3),
x^(k) = Σ_{j=0}^{k−1} G^j u = Σ_{j=0}^{k−1} λ^j u.
If λ = 1, then x^(k) = ku, which clearly diverges as k → ∞. If λ ≠ 1, then x^(k) = (λ^k − 1)(λ − 1)^{-1} u, which also diverges since lim_{k→∞} λ^k does not exist when |λ| ≥ 1.
Corollary (Iterative Method Convergence)
The iterative formula Qx^(k) = (Q − A)x^(k−1) + b (k ≥ 1) will produce a sequence converging to the solution of Ax = b, for any starting vector x^(0), if ρ(I − Q^{-1}A) < 1.
Let A = D + L + U, where D = diag(A), L is the strictly lower triangular part of A, and U is the strictly upper triangular part of A.

Gauss-Seidel Method
In the Gauss-Seidel method, the splitting matrix is Q = D + L and the iteration matrix is
G = −(D + L)^{-1}U = I − Q^{-1}A.
So the iteration formula is
(D + L)x^(k) = −Ux^(k−1) + b,
i.e., Qx^(k) = (Q − A)x^(k−1) + b    (k ≥ 1).
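A componentwise sketch of the Gauss-Seidel iteration (our code; assumes a_ii ≠ 0). Unlike Jacobi, each new component is used immediately in the rows below it, which is exactly the forward solve with D + L:

```python
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-10, max_iter=1000):
    """Gauss-Seidel iteration: (D + L) x_new = -U x_old + b."""
    n = len(b)
    x = x0.astype(float)
    for k in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # entries 0..i-1 of x are already new; i+1..n-1 are still old
            s = A[i, :i] @ x[:i] + A[i, i+1:] @ x[i+1:]
            x[i] = (b[i] - s) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            return x, k + 1
    return x, max_iter
```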
Gauss-Seidel iteration: componentwise form and applicability. For the linear system Ax = b,
a_11 x_1 + a_12 x_2 + ··· + a_1n x_n = b_1
a_21 x_1 + a_22 x_2 + ··· + a_2n x_n = b_2
...
a_n1 x_1 + a_n2 x_2 + ··· + a_nn x_n = b_n
the componentwise form of the Gauss-Seidel iteration is
x_1^(k) = −(1/a_11)(a_12 x_2^(k−1) + ··· + a_1n x_n^(k−1) − b_1)
x_2^(k) = −(1/a_22)(a_21 x_1^(k) + a_23 x_3^(k−1) + ··· + a_2n x_n^(k−1) − b_2)
x_3^(k) = −(1/a_33)(a_31 x_1^(k) + a_32 x_2^(k) + a_34 x_4^(k−1) + ··· + a_3n x_n^(k−1) − b_3)
...
x_n^(k) = −(1/a_nn)(a_n1 x_1^(k) + a_n2 x_2^(k) + ··· + a_n,n−1 x_{n−1}^(k) − b_n)
Under what condition is the Gauss-Seidel iteration well defined (usable)? Again, it requires a_ii ≠ 0 for every i.
~µ^gauss-seidels {) ( ) 1 8 0 A = 1 0 9 9 1 1 Њ x (0) = (0, 0, 0) T. ): Ax = b,ù ), b = ( 7 8 7.
~µ^gauss-seidels {) ( ) 1 8 0 A = 1 0 9 9 1 1 Њ x (0) = (0, 0, 0) T. Ax = b,ù ), b = ( 7 8 7 ): Ïé a 22 = 0, IŠý?n ^ S µ.
~µ^gauss-seidels {) ( ) 1 8 0 A = 1 0 9 9 1 1 Њ x (0) = (0, 0, 0) T. Ax = b,ù ), b = ( 7 8 7 ): Ïé a 22 = 0, IŠý?n ^ S µ ( ) ( ) 9 1 1 77 A 1 8 0, b 1 0 9 8.
The iteration formulas are
x_1^(k) = (1/9)(x_2^(k−1) + x_3^(k−1) + 7)
x_2^(k) = (1/8)(x_1^(k) + 7)
x_3^(k) = (1/9)(x_1^(k) + 8)
Iterating to step k = 4:
x^(1) = (0.7778, 0.9722, 0.9753)
x^(2) = (0.9942, 0.9993, 0.9994)
x^(3) = (0.9999, 0.9999, 0.9999)
x^(4) = (1.0000, 1.0000, 1.0000)
(The exact solution is x = (1, 1, 1)^T.)
Theorem (on Gauss-Seidel Method Convergence) If A is diagonally dominant, then the Gauss-Seidel Method converges for any starting vector.
Proof. It suffices to prove that ρ(I − Q^{-1}A) < 1. Let λ be any eigenvalue of I − Q^{-1}A and x a corresponding eigenvector. Assume, WLOG, that ||x||_∞ = 1. We have
(I − Q^{-1}A)x = λx, or Qx − Ax = λQx.
Since the splitting matrix Q is the lower triangular part of A, including its diagonal, this reads componentwise
−Σ_{j=i+1}^n a_ij x_j = λ Σ_{j=1}^i a_ij x_j    (1 ≤ i ≤ n).
Then
λ a_ii x_i = −Σ_{j=i+1}^n a_ij x_j − λ Σ_{j=1}^{i−1} a_ij x_j    (1 ≤ i ≤ n).
Choose an index i s.t. |x_i| = 1 ≥ |x_j| for all j. Then
|λ| |a_ii| ≤ Σ_{j=i+1}^n |a_ij| + |λ| Σ_{j=1}^{i−1} |a_ij|.
Solving for |λ| and using the diagonal dominance of A, we get
|λ| ≤ { Σ_{j=i+1}^n |a_ij| } { |a_ii| − Σ_{j=1}^{i−1} |a_ij| }^{-1} < 1.
Example: solve Ax = b by both the Jacobi and the Gauss-Seidel iterations, where
A = [ 2  −1   1
      1   1   1
      1   1  −2 ],
and discuss their convergence.
Solution: The Jacobi iteration matrix is
G = I − D^{-1}A = [  0   1/2  −1/2
                    −1    0   −1
                    1/2  1/2   0  ],
with characteristic polynomial
det(λI − G) = λ^3 + (5/4)λ = 0 ⟹ λ_1 = 0, λ_{2,3} = ±(√5/2)i.
Since ρ(G) = √5/2 > 1, the Jacobi iteration does not converge.
If instead the Gauss-Seidel iteration is used, the iteration matrix is
G = −(D + L)^{-1}U = [ 0  1/2  −1/2
                       0 −1/2  −1/2
                       0   0   −1/2 ],
whose eigenvalues are found to be
λ_1 = 0,   λ_{2,3} = −1/2.
Since ρ(G) = 1/2 < 1, the Gauss-Seidel iteration converges.
Let A = D + L + U, where D = diag(A), L is the strictly lower triangular part of A, and U is the strictly upper triangular part of A.

SOR (Successive Over-Relaxation) Method
In the SOR method, the splitting matrix is Q = ω^{-1}D + L, the iteration matrix is
G = (D + ωL)^{-1}((1 − ω)D − ωU) = I − Q^{-1}A,
and the iteration formula is
(D + ωL)x^(k) = ω(−Ux^(k−1) + b) + (1 − ω)Dx^(k−1),
i.e., Qx^(k) = (Q − A)x^(k−1) + b    (k ≥ 1).
SOR iteration: componentwise form and applicability. Recall the componentwise Gauss-Seidel iteration given above. The componentwise form of the SOR iteration (a sketch of which follows below) is
x_1^(k) = x_1^(k−1) − (ω/a_11)(a_11 x_1^(k−1) + a_12 x_2^(k−1) + a_13 x_3^(k−1) + ··· + a_1n x_n^(k−1) − b_1)
x_2^(k) = x_2^(k−1) − (ω/a_22)(a_21 x_1^(k) + a_22 x_2^(k−1) + a_23 x_3^(k−1) + ··· + a_2n x_n^(k−1) − b_2)
x_3^(k) = x_3^(k−1) − (ω/a_33)(a_31 x_1^(k) + a_32 x_2^(k) + a_33 x_3^(k−1) + a_34 x_4^(k−1) + ··· + a_3n x_n^(k−1) − b_3)
...
x_n^(k) = x_n^(k−1) − (ω/a_nn)(a_n1 x_1^(k) + a_n2 x_2^(k) + ··· + a_n,n−1 x_{n−1}^(k) + a_nn x_n^(k−1) − b_n)
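A componentwise sketch of SOR (our code; assumes a_ii ≠ 0 and 0 < ω < 2). Each component moves from its old value by ω times the Gauss-Seidel correction, so ω = 1 reproduces Gauss-Seidel exactly:

```python
import numpy as np

def sor(A, b, x0, omega, tol=1e-10, max_iter=1000):
    """SOR sweep: x_i <- x_i + omega * (b_i - row_i . x) / a_ii."""
    n = len(b)
    x = x0.astype(float)
    for k in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # residual of row i against the partially updated iterate
            r_i = b[i] - A[i, :] @ x
            x[i] = x[i] + omega * r_i / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) < tol:
            return x, k + 1
    return x, max_iter
```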
Theorem (on SOR Method Convergence)
In the SOR method, suppose that the splitting matrix Q is chosen to be αD − C, where α is a real parameter, D is any positive definite Hermitian matrix, and C is any matrix satisfying C + C* = D − A. If A is positive definite Hermitian, if Q is nonsingular, and if α > 1/2, then the SOR iteration converges for any starting vector.

Remark
In the literature, the parameter α is usually denoted by 1/ω. So the SOR iteration converges when 0 < ω < 2.
Summary of iterative methods
Equivalent forms (assume Q is nonsingular):
Ax = b ⟺ Qx = (Q − A)x + b ⟺ x = (I − Q^{-1}A)x + Q^{-1}b.
Iteration formulas:
Qx^(k) = (Q − A)x^(k−1) + b, i.e., x^(k) = (I − Q^{-1}A)x^(k−1) + Q^{-1}b.
Let A = D + L + U and assume 0 < ω < 2.

Method        | Splitting matrix Q                       | Iteration matrix G = I − Q^{-1}A
Richardson    | I                                        | I − A
Jacobi        | D                                        | I − D^{-1}A
Gauss-Seidel  | D + L                                    | −(D + L)^{-1}U
SOR           | ω^{-1}D + L                              | (D + ωL)^{-1}((1 − ω)D − ωU)
SSOR          | (ω(2 − ω))^{-1}(D + ωL)D^{-1}(D + ωU)    | I − Q^{-1}A
Extrapolation
Extrapolation is a technique that can be used to improve the convergence properties of a linear iterative process. Consider the iterative formula
x^(k) = Gx^(k−1) + c.    (4)
Introduce a parameter γ ≠ 0 and embed the above iteration in a one-parameter family of iterative methods given by
x^(k) = γ(Gx^(k−1) + c) + (1 − γ)x^(k−1) = G_γ x^(k−1) + γc,    (5)
where G_γ = γG + (1 − γ)I.
If the iteration in (5) converges, say to x, then by taking a limit we get
x = γ(Gx + c) + (1 − γ)x, or x = Gx + c.
Note that the iteration in (4) is used to produce a sequence converging to a solution of x = Gx + c; if G = I − Q^{-1}A and c = Q^{-1}b, then it corresponds to solving Ax = b.
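As a one-step sketch (our function name; G, c, x are assumed to be NumPy arrays):

```python
def extrapolated_step(G, c, x, gamma):
    """One step of (5): x_new = gamma*(G x + c) + (1 - gamma)*x,
    i.e. x_new = G_gamma x + gamma*c with G_gamma = gamma*G + (1 - gamma)*I."""
    return gamma * (G @ x + c) + (1.0 - gamma) * x
```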
Theorem (on Eigenvalues of p(A))
If λ is an eigenvalue of A and if p is a polynomial, then p(λ) is an eigenvalue of p(A).

Proof. Hint: let Ax = λx with x ≠ 0. It is easy to see that
A^k x = λ^k x    (k ≥ 0).
So, writing p(z) = Σ_{k=0}^m c_k z^k,
p(A)x = Σ_{k=0}^m c_k A^k x = Σ_{k=0}^m c_k λ^k x = p(λ)x.
Remark
Suppose we do not know the eigenvalues of G precisely, but know only that some interval, say [a, b], contains all of them. By the theorem, the eigenvalues of G_γ = γG + (1 − γ)I lie in the interval with endpoints γa + (1 − γ) and γb + (1 − γ). Denote by Λ(A) the set of eigenvalues of a matrix A. Then
ρ(G_γ) = max_{λ∈Λ(G_γ)} |λ| = max_{λ∈Λ(G)} |γλ + 1 − γ| ≤ max_{λ∈[a,b]} |γλ + 1 − γ|.
Now, the purpose of extrapolation is to choose γ to achieve
min_γ ρ(G_γ) = min_γ max_{λ∈Λ(G)} |γλ + 1 − γ| ≤ min_γ max_{λ∈[a,b]} |γλ + 1 − γ|,
and in particular to make min_γ max_{λ∈[a,b]} |γλ + 1 − γ| < 1.
Theorem (on Optimal Extrapolation Parameters)
If the only information available about the eigenvalues of G is that they lie in the interval [a, b], and if 1 ∉ [a, b], then the best choice for γ is 2/(2 − a − b). With this value of γ, ρ(G_γ) ≤ 1 − |γ|d, where d is the distance from 1 to [a, b].

Proof. See your lecture notes or pp. 222-223 in the book.
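As a quick illustration (numbers of our own choosing, not from the book): if the eigenvalues of G are known only to lie in [a, b] = [−1, 1/2], then γ = 2/(2 − (−1) − 1/2) = 4/5 and d = 1 − 1/2 = 1/2, so ρ(G_γ) ≤ 1 − (4/5)(1/2) = 3/5 < 1. The extrapolated iteration is guaranteed to converge even though the interval alone does not guarantee ρ(G) < 1.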
Remark The extrapolation process or technique just discussed can be applied to methods that are not convergent themselves. All that is required is that the eigenvalues of G be real and lie in an interval that does not contain 1.
Example
Determine the spectral radius of the optimal extrapolated Richardson method.
Hint: in the Richardson method, Q = I and G = I − A.

Example
Determine the spectral radius of the optimal extrapolated Jacobi method.
Chebyshev Acceleration
Chebyshev acceleration is an acceleration procedure that tries to use all available information to get a better approximation to the solution of the linear system. As before, consider a basic iterative method
x^(k) = Gx^(k−1) + c.    (6)
Recall that a solution of the problem is a vector x s.t. x = Gx + c. At step k in the process, we shall have computed the vectors x^(1), x^(2), ..., x^(k), and we ask whether some linear combination of these vectors is perhaps a better approximation to the exact solution than x^(k) itself.
Assume that a_0^(k) + a_1^(k) + ··· + a_k^(k) = 1 and set
u^(k) = Σ_{i=0}^k a_i^(k) x^(i).
Then
u^(k) − x = Σ_{i=0}^k a_i^(k) (x^(i) − x) = Σ_{i=0}^k a_i^(k) G^i (x^(0) − x) = p(G)(x^(0) − x),
where p is the polynomial defined by
p(z) = Σ_{i=0}^k a_i^(k) z^i    (so that p(1) = 1).
Taking norms, we get
||u^(k) − x|| ≤ ||p(G)|| ||x^(0) − x||.
If the eigenvalues µ_i of G lie within some bounded set S in the complex plane, then by the previous analysis,
ρ(p(G)) = max_{1≤i≤n} |p(µ_i)| ≤ max_{z∈S} |p(z)|.
Then the problem reduces to
min_{p∈P_k, p(1)=1} ρ(p(G)) ≤ min_{p∈P_k, p(1)=1} max_{z∈S} |p(z)|,
where P_k denotes the set of all real polynomials of degree at most k. This is a standard problem in approximation theory.
For example, if S is an interval, say [a, b] ⊂ R, not containing 1, then a scaled and shifted Chebyshev polynomial solves this min-max problem. The classic Chebyshev polynomial T_k (k ≥ 1) is the unique polynomial of degree k with leading coefficient 2^{k−1} that minimizes
max_{−1≤z≤1} |T_k(z)|.
These polynomials can be generated recursively by
T_0(z) = 1,   T_1(z) = z,   T_k(z) = 2zT_{k−1}(z) − T_{k−2}(z)    (k ≥ 2).
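The three-term recurrence is straightforward to evaluate numerically (a sketch; the function name is ours):

```python
def chebyshev(k, z):
    """Evaluate T_k(z) via T_0 = 1, T_1 = z, T_k = 2 z T_{k-1} - T_{k-2}."""
    t_prev, t = 1.0, z
    if k == 0:
        return t_prev
    for _ in range(2, k + 1):
        t_prev, t = t, 2 * z * t - t_prev
    return t
```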
Now, suppose that the eigenvalues of G are contained in an interval [a, b] with 1 ∉ [a, b], say b < 1. We are interested in the min-max problem
min_{p_k∈P_k, p_k(1)=1} max_{z∈[a,b]} |p_k(z)|.
The answer to this problem is contained in the four lemmas on pp. 225-227 of the textbook.