Computational Linear Algebra
PD Dr. rer. nat. habil. Ralf Peter Mundani
Computation in Engineering / BGU
Scientific Computing in Computer Science / INF
Winter Term 2017/18
Part 3: Iterative Methods
overview
- definitions
- splitting methods
- projection and KRYLOV subspace methods
- multigrid methods
basic concept

we consider linear systems of type

  $Ax = b$  (3.2.1)

with regular matrix $A$ and right-hand side $b$

Definition 3.17
A projection method for solving (3.2.1) is a technique that computes approximate solutions $x_m \in x_0 + K_m$ under consideration of

  $(b - Ax_m) \perp L_m$,  (3.2.2)

where $x_0$ is arbitrary and $K_m$ and $L_m$ represent $m$-dimensional subspaces of $\mathbb{R}^n$. Here, orthogonality is defined via the EUCLIDEAN dot product: $x \perp y \iff (x, y)_2 = 0$.
basic concept (cont'd)

observation
- in case $K_m = L_m$, the residual vector $r_m = b - Ax_m$ is perpendicular to $K_m$; we obtain an orthogonal projection method and (3.2.2) is called GALERKIN condition
- in case $K_m \neq L_m$, we obtain a skew projection and (3.2.2) is called PETROV-GALERKIN condition

comparison

  splitting methods:
  - computation of approximated solutions $x_m \in \mathbb{R}^n$
  - computation method: $x_{m+1} = M x_m + N b$

  projection methods:
  - computation of approximated solutions $x_m \in x_0 + K_m \subset \mathbb{R}^n$, $\dim K_m = m \le n$
  - computation method: $b - Ax_m \perp L_m \subset \mathbb{R}^n$, $\dim L_m = m \le n$
basic concept (cont'd)

Definition 3.18
A KRYLOV subspace method is a projection method for solving (3.2.1), where $K_m$ represents the KRYLOV subspace

  $K_m = K_m(A, r_0) = \mathrm{span}\{r_0, Ar_0, \dots, A^{m-1} r_0\}$

with $r_0 = b - Ax_0$.

- KRYLOV subspace methods are often described as a reformulation of a linear system into a minimisation problem
- well-known methods are conjugate gradients (HESTENES & STIEFEL, 1952) and GMRES (SAAD & SCHULTZ, 1986)
- both methods compute the optimal approximation $x_m \in x_0 + K_m$ w.r.t. (3.2.2) via incrementing the subspace dimension in every iteration by one
- neglecting round-off errors, both methods would compute the exact solution after at most $n$ iterations
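The construction of $K_m(A, r_0)$ can be sketched in a few lines; the matrix and starting residual below are made-up illustration values, not data from the slides:

```python
import numpy as np

def krylov_basis(A, r0, m):
    """Columns r0, A r0, ..., A^(m-1) r0 of the Krylov matrix."""
    cols = [r0]
    for _ in range(m - 1):
        cols.append(A @ cols[-1])
    return np.column_stack(cols)

# assumed example data (not from the slides)
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
r0 = np.array([1.0, 2.0])

K2 = krylov_basis(A, r0, 2)
rank = np.linalg.matrix_rank(K2)  # here K_2(A, r0) already spans R^2
```

For this particular pair $(A, r_0)$ the Krylov space reaches full dimension at $m = n = 2$, which is why the methods above terminate after at most $n$ steps in exact arithmetic.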
method of steepest descent

note: throughout this section, we assume the linear system (3.2.1) to exhibit a symmetric and positive definite (SPD) matrix

we further consider functions $F: \mathbb{R}^n \to \mathbb{R}$,

  $F(x) = \tfrac{1}{2}(Ax, x)_2 - (b, x)_2$  (3.2.3)

and will first study some of their properties in order to derive the method

Lemma 3.19
Let $A$ be symmetric positive definite and $b$ given; then for a function $F$ defined via (3.2.3) applies

  $\hat{x} = \arg\min_x F(x) \iff A\hat{x} = b$.
method of steepest descent (cont'd)

idea: we want to achieve a successive minimisation of $F$ based on point $x$ along particular directions $p$; hence, we define for $x, p \in \mathbb{R}^n$ a function

  $f_{x,p}(\alpha) := F(x + \alpha p)$

Lemma and Definition 3.20
Let matrix $A$ be symmetric positive definite and vectors $x, p \in \mathbb{R}^n$ with $p \neq 0$ be given, hence

  $\alpha_{\mathrm{opt}} = \alpha_{\mathrm{opt}}(x, p) := \arg\min_\alpha f_{x,p}(\alpha) = \dfrac{(r, p)_2}{(Ap, p)_2}$

applies with $r := b - Ax$. Vector $r$ is denoted as residual vector and its EUCLIDEAN norm $\|r\|_2$ as residual.
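Lemma 3.20 can be checked numerically: $f_{x,p}$ is a strictly convex parabola in $\alpha$, so its value at $\alpha_{\mathrm{opt}}$ must not exceed the values at neighbouring step lengths. The SPD matrix and vectors below are assumed example data:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])  # SPD, assumed example data
b = np.array([1.0, 2.0])

def F(x):
    # F(x) = 1/2 (Ax, x)_2 - (b, x)_2, cf. (3.2.3)
    return 0.5 * x @ (A @ x) - b @ x

x = np.zeros(2)
p = np.array([1.0, 1.0])
r = b - A @ x
alpha_opt = (r @ p) / ((A @ p) @ p)

# sample F along x + alpha p left of, at, and right of alpha_opt
vals = [F(x + a * p) for a in (alpha_opt - 0.1, alpha_opt, alpha_opt + 0.1)]
```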
method of steepest descent (cont'd)

with a given sequence $(p_m)_m$ of search directions out of $\mathbb{R}^n \setminus \{0\}$, we can determine a first method

basic solver
  choose $x_0 \in \mathbb{R}^n$
  for $m = 0, 1, \dots$
    $r_m = b - Ax_m$
    $\alpha_m = \dfrac{(r_m, p_m)_2}{(Ap_m, p_m)_2}$
    $x_{m+1} = x_m + \alpha_m p_m$

in order to complete our basic solver, we need a method to compute search directions $p_m$
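The basic solver can be sketched directly. As a hypothetical choice of the sequence $(p_m)_m$, the example cycles through the coordinate directions; this particular choice is not from the slides and merely illustrates that any fixed direction sequence fits the scheme:

```python
import numpy as np

def basic_solver(A, b, x0, directions, iters):
    """Line-search scheme: x_{m+1} = x_m + alpha_m p_m with the
    optimal step length alpha_m = (r_m, p_m)_2 / (A p_m, p_m)_2."""
    x = x0.astype(float)
    for m in range(iters):
        p = directions[m % len(directions)]
        r = b - A @ x
        x = x + ((r @ p) / ((A @ p) @ p)) * p
    return x

# assumed example data; cycling through the coordinate directions is
# one possible (hypothetical) direction sequence
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
dirs = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
x = basic_solver(A, b, np.zeros(2), dirs, 50)
```

With coordinate directions this scheme amounts to successive one-dimensional minimisations and converges for SPD matrices, but the convergence speed depends entirely on the direction choice, which motivates the discussion that follows.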
method of steepest descent (cont'd)

further (w/o loss of generality), we request $\|p_m\|_2 = 1$

for $x \neq A^{-1}b$ we achieve a globally optimal choice via

  $p = \dfrac{\hat{x} - x}{\|\hat{x} - x\|_2}$ with $\hat{x} = A^{-1}b$,

as hereby, with $r = b - Ax = A(\hat{x} - x)$ and the definition of $\alpha_{\mathrm{opt}}$ according to 3.20, follows

  $\alpha_{\mathrm{opt}} = \dfrac{(r, p)_2}{(Ap, p)_2} = \dfrac{(b - Ax, \hat{x} - x)_2 \,/\, \|\hat{x} - x\|_2}{(A(\hat{x} - x), \hat{x} - x)_2 \,/\, \|\hat{x} - x\|_2^2} = \|\hat{x} - x\|_2$

and therefore $x + \alpha_{\mathrm{opt}} p = x + (\hat{x} - x) = \hat{x}$

however, this approach requires the knowledge of the exact solution $\hat{x}$ for computing search directions
method of steepest descent (cont'd)

restricting to local optimality, search directions can be computed with the negative gradient of $F$; here, as $A$ is symmetric,

  $\nabla F(x) = \tfrac{1}{2}(A + A^T)x - b = Ax - b = -r$

applies, hence

  $p := \begin{cases} r / \|r\|_2 & \text{for } r \neq 0 \\ 0 & \text{for } r = 0 \end{cases}$  (3.2.4)

yields the direction of steepest descent

function $F$ is, due to $\nabla^2 F(x) = A$ and SPD matrix $A$, strictly convex; it is obvious that $\hat{x} = A^{-1}b$, due to $\nabla F(\hat{x}) = 0$, represents the only and global minimum of $F$
method of steepest descent (cont'd)

with (3.2.4) we obtain the method of steepest descent (a.k.a. gradient method)

  choose $x_0 \in \mathbb{R}^n$
  for $m = 0, 1, \dots$
    $r_m = b - Ax_m$
    if $r_m = 0$: stop
    else:
      $\alpha_m = \dfrac{\|r_m\|_2^2}{(Ar_m, r_m)_2}$
      $x_{m+1} = x_m + \alpha_m r_m$
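A minimal sketch of the gradient method; the diagonal test matrix with eigenvalues 2 and 10 (matching the eigenvalues of the subsequent example) and the right-hand side are assumed example data:

```python
import numpy as np

def steepest_descent(A, b, x0, tol=1e-12, max_iter=1000):
    """Gradient method: p_m = r_m, alpha_m = ||r_m||^2 / (A r_m, r_m)_2."""
    x = x0.astype(float)
    for _ in range(max_iter):
        r = b - A @ x
        if np.linalg.norm(r) <= tol:   # r_m = 0 -> stop
            break
        alpha = (r @ r) / ((A @ r) @ r)
        x = x + alpha * r
    return x

# assumed example data: eigenvalues 2 and 10, exact solution (1, 1)
A = np.array([[2.0, 0.0],
              [0.0, 10.0]])
b = np.array([2.0, 10.0])
x = steepest_descent(A, b, np.array([8.0, 2.0]))
```

The worst-case error reduction per step is $(\kappa - 1)/(\kappa + 1)$ in the $A$-norm; with $\kappa = 5$ here, the iteration converges, but slowly compared with the methods derived below.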
method of steepest descent (cont'd)

example: consider $Ax = b$ with given $A$, $b$ and starting vector $x_0$; thus, we get the following convergence ($\varepsilon_m := \|x_m - A^{-1}b\|_A$):

  m  | x_{m,1}       | x_{m,2}       | eps_m
  ---|---------------|---------------|--------------
  0  | 4.000000e+00  | 1.341641e+00  | 7.071068e+00
  10 | 3.271049e-02  | 1.097143e-02  | 5.782453e-02
  40 | 1.788827e-08  | 5.999910e-09  | 3.162230e-08
  70 | 9.782499e-15  | 3.281150e-15  | 1.729318e-14
  72 | 3.740893e-15  | 1.254734e-15  | 6.613026e-15
method of steepest descent (cont'd)

what's happening here? ($\lambda_1 = 2$, $\lambda_2 = 10$)

[figure: contour lines of F with iterates $x_0, x_1, x_2, x_3$ zig-zagging towards the minimum]

- contour lines of $F$ denote convergence process; stretched ellipses due to differently large values of the diagonal entries of $A$
- residual vector always points into the direction of the point of origin, but the approximated solution might change its sign in every single iteration
- motivates further considerations w.r.t. optimality of search directions
method of steepest descent (cont'd)

observation: the gradient method represents in every step a projection method with $K = L = \mathrm{span}\{r_{m-1}\}$

- obviously, optimality of the approximated solution concerning the entire subspace $U = \mathrm{span}\{r_0, r_1, \dots, r_{m-1}\}$ would be preferable; for linearly independent residual vectors, hereby at the latest follows $x_n = A^{-1}b$
- for the method of steepest descent, all approximated solutions $x_m$ are optimal concerning $r_{m-1}$ only; due to missing transitivity of the condition $r \perp p$, from $r_{m+2} \perp r_{m+1}$ and $r_{m+1} \perp r_m$ does not (necessarily) follow $r_{m+2} \perp r_m$
- remedy: method of conjugate directions
method of conjugate directions

idea: extend optimality of approximated solution $x_m$ to the entire subspace $U = \mathrm{span}\{p_0, \dots, p_{m-1}\}$ with linearly independent search directions $p_i$

the following theorem formulates a condition for search directions that assures optimality w.r.t. $U_m$ in the $(m+1)$-st iteration step

Theorem 3.21
Let $F$ according to (3.2.3) be given and $x$ be optimal w.r.t. subspace $U = \mathrm{span}\{p_0, \dots, p_{m-1}\}$; then $x + p$ is optimal w.r.t. $U$ iff $Ap \perp U$ applies.
method of conjugate directions (cont'd)

if for search directions $p_m$ either $Ap_m \perp U_m = \mathrm{span}\{p_0, \dots, p_{m-1}\}$ or, equivalently, $Ap_m \perp p_j$, $j = 0, \dots, m-1$ applies, then the approximated solution

  $x_{m+1} = x_m + \alpha_m p_m$

inherits, according to 3.21, optimality from $x_m$ w.r.t. $U_m$, independent from the choice of the scalar weighting factor $\alpha_m$

this degree of freedom $\alpha_m$ will be used further to extend optimality w.r.t. $U_{m+1} = \mathrm{span}\{p_0, \dots, p_m\}$
method of conjugate directions (cont'd)

Definition 3.22
Let $A \in \mathbb{R}^{n \times n}$; then vectors $p_0, \dots, p_m$ are called pairwise conjugated or A-orthogonal if

  $(p_i, p_j)_A := (Ap_i, p_j)_2 = 0$ for $i, j \in \{0, \dots, m\}$ and $i \neq j$

applies.

let pairwise conjugated search directions $p_0, \dots, p_m \in \mathbb{R}^n \setminus \{0\}$ be given and $x_m$ be optimal w.r.t. $U_m = \mathrm{span}\{p_0, \dots, p_{m-1}\}$; then we get optimality of $x_{m+1} = x_m + \alpha_m p_m$ w.r.t. $U_{m+1}$ if

  $0 = (b - Ax_{m+1}, p_j)_2 = (b - Ax_m, p_j)_2 - \alpha_m (Ap_m, p_j)_2$ for $j = 0, \dots, m$

applies, where $(b - Ax_m, p_j)_2 = 0$ for $j < m$ (optimality of $x_m$) and $(Ap_m, p_j)_2 = 0$ for $j < m$ (conjugated directions)
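A-orthogonality per Definition 3.22 is easy to verify numerically. Since $A$ is symmetric, its eigenvectors are pairwise orthogonal and hence also pairwise conjugated, because $(Av_i, v_j)_2 = \lambda_i (v_i, v_j)_2 = 0$; the matrix below is an assumed example:

```python
import numpy as np

# assumed example matrix (SPD)
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

# eigenvectors of a symmetric matrix are pairwise orthogonal and hence
# also pairwise conjugated: (A v_i, v_j)_2 = lambda_i (v_i, v_j)_2 = 0
_, V = np.linalg.eigh(A)
p0, p1 = V[:, 0], V[:, 1]
inner_A = (A @ p0) @ p1    # (p0, p1)_A, should vanish up to round-off
```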
method of conjugate directions (cont'd)

for $j = m$ we yield the representation $\alpha_m = \dfrac{(r_m, p_m)_2}{(Ap_m, p_m)_2}$ and, thus, obtain the method of conjugate directions

  choose $x_0 \in \mathbb{R}^n$
  $r_0 = b - Ax_0$
  for $m = 0, 1, \dots, n-1$
    $\alpha_m = \dfrac{(r_m, p_m)_2}{(Ap_m, p_m)_2}$
    $x_{m+1} = x_m + \alpha_m p_m$
    $r_{m+1} = r_m - \alpha_m Ap_m$

if search directions are chosen inappropriately, only $x_n$ may yield the exact solution while even $x_{n-1}$ still has a large error; in general, this method is used as a direct method with given search directions only
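A sketch of the method as a direct solver with given conjugate directions; the eigenvectors of the (assumed example) SPD matrix serve as an A-orthogonal direction set:

```python
import numpy as np

def conjugate_directions(A, b, x0, P):
    """Direct method: with n pairwise conjugated directions P[0..n-1],
    x_n is the exact solution up to round-off."""
    x = x0.astype(float)
    r = b - A @ x
    for p in P:
        alpha = (r @ p) / ((A @ p) @ p)   # alpha_m
        x = x + alpha * p
        r = r - alpha * (A @ p)           # r_{m+1} = r_m - alpha_m A p_m
    return x

# assumed example data; eigenvectors of the SPD matrix A are pairwise
# A-orthogonal and serve as given search directions
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
_, V = np.linalg.eigh(A)
x = conjugate_directions(A, b, np.zeros(2), [V[:, 0], V[:, 1]])
```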
CG: method of conjugate gradients

combination of the methods of steepest descent and conjugate directions in order to obtain a problem-oriented approach w.r.t. selection of search directions and optimality w.r.t. orthogonality of search directions

with residual vectors $r_0, \dots, r_m$ we successively determine search directions for $m = 0, \dots, n-1$ according to

  $p_0 = r_0$
  $p_m = r_m + \sum_{j=0}^{m-1} \beta_j p_j$  (3.2.5)

for $\beta_j \in \mathbb{R}$ ($j = 0, \dots, m-1$), we achieve an analogous selection of search directions according to the method of steepest descent; hence, under consideration of the already used search directions $p_0, \dots, p_{m-1} \in \mathbb{R}^n \setminus \{0\}$, there exist $m$ degrees of freedom in choosing $\beta_j$ to assure search directions to be conjugated
CG: method of conjugate gradients (cont'd)

from the required A-orthogonality constraint follows

  $0 = (Ap_m, p_i)_2 = (Ar_m, p_i)_2 + \sum_{j=0}^{m-1} \beta_j (Ap_j, p_i)_2$ for $i = 0, \dots, m-1$

hence, with $(Ap_j, p_i)_2 = 0$ for $i, j \in \{0, \dots, m-1\}$ and $i \neq j$, we obtain the wanted algorithm to compute the coefficients

  $\beta_i = -\dfrac{(Ar_m, p_i)_2}{(Ap_i, p_i)_2}$  (3.2.6)
CG: method of conjugate gradients (cont'd)

thus we obtain the preliminary method of conjugate gradients

  choose $x_0 \in \mathbb{R}^n$
  $p_0 = r_0 = b - Ax_0$
  for $m = 0, 1, \dots, n-1$
    $\alpha_m = \dfrac{(r_m, p_m)_2}{(Ap_m, p_m)_2}$
    $x_{m+1} = x_m + \alpha_m p_m$
    $r_{m+1} = r_m - \alpha_m Ap_m$
    $p_{m+1} = r_{m+1} - \sum_{j=0}^{m} \dfrac{(Ar_{m+1}, p_j)_2}{(Ap_j, p_j)_2}\, p_j$
CG: method of conjugate gradients (cont'd)

problem: for computation of $p_{m+1}$ all $p_j$ ($j = 0, \dots, m$) are necessary due to

  $p_{m+1} = r_{m+1} - \sum_{j=0}^{m} \dfrac{(Ar_{m+1}, p_j)_2}{(Ap_j, p_j)_2}\, p_j$  (3.2.7)

observation
a) $p_m$ is conjugated to all $p_j$ with $0 \le j < m$ due to (3.2.5) and (3.2.6)
b) $r_m \perp U_m = \mathrm{span}\{r_0, \dots, r_{m-1}\} = \mathrm{span}\{p_0, \dots, p_{m-1}\}$
c) $r_m$ is conjugated to all $p_j$ with $0 \le j < m-1$

for (c) applies $p_j \in U_{m-1}$ for $0 \le j < m-1$, hence $Ap_j \in U_m$, and we get

  $(Ar_m, p_j)_2 \overset{A \text{ symm.}}{=} (r_m, Ap_j)_2 \overset{(b)}{=} 0$
CG: method of conjugate gradients (cont'd)

from (c) and (3.2.7) follows

  $p_m = r_m - \sum_{j=0}^{m-1} \dfrac{(Ar_m, p_j)_2}{(Ap_j, p_j)_2}\, p_j = r_m - \dfrac{(Ar_m, p_{m-1})_2}{(Ap_{m-1}, p_{m-1})_2}\, p_{m-1}$

further, the method can stop in the $(k+1)$-st iteration if $p_k = 0$ (or $\|p_k\|_2^2 = 0$), i.e. the solution $x_k = A^{-1}b$ has been found: as from $r_k = b - Ax_k$ follows $x_k = A^{-1}b \iff r_k = 0$, substituting $r_k$ into the above equation for $p_k$ yields the wanted result

finally, we obtain the method of conjugate gradients
CG: method of conjugate gradients (cont'd)

  choose $x_0 \in \mathbb{R}^n$
  $p_0 = r_0 = b - Ax_0$
  for $m = 0, 1, \dots, n-1$
    if $\|p_m\|_2^2 = 0$: STOP
    else:
      $\alpha_m = \dfrac{(r_m, p_m)_2}{(Ap_m, p_m)_2}$
      $x_{m+1} = x_m + \alpha_m p_m$
      $r_{m+1} = r_m - \alpha_m Ap_m$
      $p_{m+1} = r_{m+1} - \dfrac{(Ar_{m+1}, p_m)_2}{(Ap_m, p_m)_2}\, p_m$
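The final algorithm translates almost line by line into code. This is a sketch with assumed example data; note that the $\beta$-update as written still spends a second matrix-vector product on $Ar_{m+1}$:

```python
import numpy as np

def cg(A, b, x0, tol=1e-12):
    """Conjugate gradients as derived above; the beta-update
    (A r_{m+1}, p_m)_2 / (A p_m, p_m)_2 costs a second product A r."""
    x = x0.astype(float)
    r = b - A @ x
    p = r.copy()
    for _ in range(len(b)):          # at most n iterations
        if p @ p <= tol**2:          # ||p_m||_2^2 = 0 -> solution found
            break
        Ap = A @ p
        alpha = (r @ p) / (Ap @ p)   # alpha_m
        x = x + alpha * p
        r = r - alpha * Ap           # r_{m+1} = r_m - alpha_m A p_m
        p = r - (((A @ r) @ p) / (Ap @ p)) * p
    return x

# assumed example data
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])
x = cg(A, b, np.zeros(2))
```

For this $2 \times 2$ SPD system the iterate $x_2$ is already exact up to round-off, in line with the at-most-$n$-iterations property stated earlier.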
CG: method of conjugate gradients (cont'd)

remarks
- CG can be further improved to a single MVM $Ap_m$ per iteration
- in case of regular matrices, there exist several variants (not to be discussed here):
  - ARNOLDI algorithm
  - LANCZOS algorithm
  - GMRES method (generalized minimal residual)
  - BiCG method (bi-conjugate gradient)
  - CGS method (conjugate gradient squared)
  - BiCGSTAB method (BiCG stabilized)
  - TFQMR method (transpose-free quasi-minimal residual)