Computational Linear Algebra PD Dr. rer. nat. habil. Ralf-Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2018/19
Part 4: Iterative Methods
overview
- definitions
- splitting methods
- projection and KRYLOV subspace methods
- multigrid methods
some considerations
let us first consider conventional iterative or relaxation methods, hence let
$Au = f$ (4.3.1)
denote a linear system derived from a finite-difference discretisation of the one-dimensional second-order boundary value problem
$-u''(x) = f(x)$, $0 < x < 1$, with $u(0) = u(1) = 0$
here $A$ is the tridiagonal matrix with $2$ on the diagonal and $-1$ on the two off-diagonals (up to the scaling factor $1/h^2$ for grid spacing $h$)
some considerations (cont'd)
let $u$ denote the exact solution of (4.3.1) and $v$ an approximation to the exact solution, then the error (or algebraic error) is given by $e = u - v$
the error is a vector, and its magnitude is measured by the maximum norm or the EUCLIDEAN norm, defined as
$\|e\|_\infty := \max_{1 \le i \le n} |e_i|$ and $\|e\|_2 := \left( \sum_{i=1}^{n} e_i^2 \right)^{1/2}$
unfortunately, the error is just as inaccessible as the exact solution itself
however, a computable measure of how well $v$ approximates $u$ is the residual, given by $r = f - Av$
some considerations (cont'd)
by uniqueness of the solution, $r = 0$ if and only if $e = 0$
from (4.3.1), using $Ae = Au - Av = f - Av$, we can derive the relation between error and residual
$Ae = r$ (4.3.2)
which is called the residual equation and plays a vital role in multigrid methods
by solving (4.3.2) for $e$ we can compute a new approximation using the definition of the error: $u = v + e$
for further considerations, it is sufficient to work with the homogeneous linear system $Au = 0$ and arbitrary initial guesses in order to start a JACOBI relaxation method
as the exact solution is known ($u = 0$), the error in some approximation $v$ is simply $-v$
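The residual equation can be checked numerically. The following is a minimal sketch, assuming the unscaled tridiagonal matrix with stencil $(-1, 2, -1)$ on seven interior points; the helper `apply_A` and the sample vectors are illustrative choices and not part of the lecture.

```python
def apply_A(x):
    """Apply tridiag(-1, 2, -1) to a vector of interior values (zero boundary values)."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(-left + 2.0 * x[i] - right)
    return y

u = [0.5, 1.0, 1.5, 2.0, 1.5, 1.0, 0.5]   # hypothetical exact solution
f = apply_A(u)                             # right-hand side chosen so that A u = f
v = [0.0] * len(u)                         # some approximation to u
v[3] = 1.0

e = [ui - vi for ui, vi in zip(u, v)]               # error e = u - v
r = [fi - avi for fi, avi in zip(f, apply_A(v))]    # residual r = f - A v
Ae = apply_A(e)

# A e and r agree up to round-off, which is the residual equation (4.3.2)
max_diff = max(abs(a - b) for a, b in zip(Ae, r))
```

In practice only `r` is computable, since `e` requires the unknown `u`; the sketch uses a known `u` purely to confirm the identity.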
some considerations (cont'd)
the one-dimensional model problem with $f = 0$ appears as
[figure: grid points $u_0, u_1, u_2, \dots, u_{n-1}, u_n$ with grid spacing $h$]
$-u_{i-1} + 2u_i - u_{i+1} = 0$, $1 \le i \le n-1$, $u_0 = u_n = 0$,
with grid points $x_i = ih$, where $h = 1/n$
to obtain some valuable insight, we apply various iterations to this system with an initial guess consisting of the vectors (or FOURIER modes)
$m_i = \sin\left(\frac{ik\pi}{n}\right)$, $0 \le i \le n$, $1 \le k \le n-1$,
where the integer $k$ is called the wave number (or frequency) and indicates the number of half sine waves that constitute $m$ on the problem domain
some considerations (cont'd)
let $m_k$ denote the entire vector $m$ with wave number $k$; shown below are the initial guesses $m_1$, $m_3$, and $m_6$
[figure: three panels showing the modes for $k = 1$, $k = 3$, and $k = 6$]
small values of $k$ belong to long, smooth waves, while large values of $k$ correspond to highly oscillatory waves
some considerations (cont'd)
we now apply a weighted JACOBI relaxation method with $\omega = 2/3$ to the model problem on a grid with $n = 64$ points for the initial guesses $m_1$, $m_3$, and $m_6$
the iteration is applied 100 times
[figure: error (0 to 1) versus iterations (0 to 100) for $k = 1$, $k = 3$, and $k = 6$]
recall that the error is just the current iterate (up to sign), as the exact solution is $u = 0$
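The experiment above can be reproduced with a few lines of code. This is a minimal sketch under the stated setting ($n = 64$, $\omega = 2/3$, 100 sweeps, homogeneous model problem); the function names `weighted_jacobi`, `mode`, and `error_after` are hypothetical helpers, and the error is measured in the maximum norm.

```python
import math

def weighted_jacobi(v, omega):
    """One weighted JACOBI sweep for -v[i-1] + 2 v[i] - v[i+1] = 0 with v[0] = v[n] = 0."""
    n = len(v) - 1
    new = v[:]
    for i in range(1, n):
        new[i] = (1.0 - omega) * v[i] + omega * 0.5 * (v[i - 1] + v[i + 1])
    return new

def mode(k, n):
    """FOURIER mode with wave number k on a grid with n intervals."""
    return [math.sin(i * k * math.pi / n) for i in range(n + 1)]

def error_after(k, n=64, omega=2.0 / 3.0, sweeps=100):
    """Maximum norm of the iterate (equal to the error norm, as u = 0) after the given sweeps."""
    v = mode(k, n)
    for _ in range(sweeps):
        v = weighted_jacobi(v, omega)
    return max(abs(x) for x in v)

errors = {k: error_after(k) for k in (1, 3, 6)}
```

The smooth mode $k = 1$ is barely reduced after 100 sweeps, whereas $k = 6$ is damped far more strongly.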
some considerations (cont'd)
plotting the logarithm of $\|e\|$ for the weighted JACOBI relaxation method with $\omega = 2/3$ shows a linear decrease in the log of the error norm, indicating that the error decreases geometrically with each iteration
[figure: log error (0.01 to 1.00) versus iterations (0 to 100) for $k = 1$, $k = 3$, and $k = 6$]
some considerations (cont'd)
in general, most initial guesses do not consist of a single mode
hence, a more realistic situation is given by an overlay of two modes, a low-frequency wave ($k = 2$) and a high-frequency wave ($k = 16$), given by
$m_i = \frac{1}{2}\left[\sin\left(\frac{2i\pi}{n}\right) + \sin\left(\frac{16i\pi}{n}\right)\right]$
on a grid with $n = 64$ points
[figure: error (0 to 1) versus iterations (0 to 100) for the initial guess $(m_2 + m_{16})/2$]
some considerations (cont'd)
live demo: playing with coins (source: wn.de)
analytical approach
the iteration methods discussed so far can be represented in the (simplified) form
$v^{(1)} = Rv^{(0)} + g$ (4.3.3)
where $R$ denotes the iteration matrix
furthermore, the exact solution $u$ is a fixed point of (4.3.3), that means
$u = Ru + g$ (4.3.4)
subtracting (4.3.3) from (4.3.4) yields $e^{(1)} = Re^{(0)}$
repeating this argument, it follows that after $m$ relaxation steps the error in the $m$-th approximation is given by $e^{(m)} = R^m e^{(0)}$
analytical approach (cont'd)
if we choose a particular vector norm (and its associated matrix norm), it is possible to bound the error after $m$ iterations by
$\|e^{(m)}\| \le \|R\|^m \, \|e^{(0)}\|$
as we assume $\rho(R) < 1$ for relaxation methods, the error is forced to zero as the iteration proceeds
considering the weighted JACOBI iteration applied to the one-dimensional model problem, we have
$R_\omega = (1 - \omega) I + \omega R_J$ with $R_\omega = I - \frac{\omega}{2} A$,
where $A$ is the tridiagonal matrix with stencil $(-1 \;\; 2 \;\; -1)$
analytical approach (cont'd)
written in this form, it follows that the eigenvalues of $R_\omega$ and $A$ are related by
$\sigma(R_\omega) = 1 - \frac{\omega}{2}\,\sigma(A)$, with $\sigma(A) = \left\{ 4\sin^2\left(\frac{k\pi}{2n}\right) \right\}$, $1 \le k \le n-1$
let $w_{k,j}$ denote the $j$-th component of the $k$-th eigenvector $w_k$; then, with the eigenvectors of $A$ given by
$w_{k,j} = \sin\left(\frac{jk\pi}{n}\right)$, $1 \le k \le n-1$, $0 \le j \le n$,
we see that the eigenvectors of $A$ are simply the FOURIER modes discussed earlier in this section
analytical approach (cont'd)
with these results, we find that the eigenvalues of $R_\omega$ are
$\lambda_k(R_\omega) = 1 - 2\omega \sin^2\left(\frac{k\pi}{2n}\right)$, $1 \le k \le n-1$,
while the eigenvectors of $R_\omega$ are the same as the eigenvectors of $A$
it is important to note that only for $0 < \omega \le 1$ it follows that $|\lambda_k(R_\omega)| < 1$ and the weighted JACOBI method converges
let $e^{(0)}$ be the error in an initial guess used in the weighted JACOBI method, then it is possible to expand $e^{(0)}$ using the eigenvectors of $A$ in the form
$e^{(0)} = \sum_{k=1}^{n-1} c_k w_k$,
where the coefficients $c_k$ denote a weighting factor for each mode in the error
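The eigenvalue formulas can be verified directly. The following minimal sketch assumes the unscaled matrix $A$ with stencil $(-1, 2, -1)$ and a small grid with $n = 16$ (an illustrative choice); `apply_A` and `apply_R` are hypothetical helpers that act on interior values only.

```python
import math

n = 16                 # number of grid intervals (illustrative choice)
omega = 2.0 / 3.0

def apply_A(x):
    """Apply tridiag(-1, 2, -1) to the interior values x_1, ..., x_{n-1}."""
    size = len(x)
    return [
        -(x[i - 1] if i > 0 else 0.0) + 2.0 * x[i] - (x[i + 1] if i < size - 1 else 0.0)
        for i in range(size)
    ]

def apply_R(x):
    """Apply the weighted JACOBI iteration matrix R_omega = I - (omega / 2) A."""
    return [xi - 0.5 * omega * axi for xi, axi in zip(x, apply_A(x))]

worst = 0.0
for k in range(1, n):
    w = [math.sin(j * k * math.pi / n) for j in range(1, n)]   # k-th FOURIER mode
    s = math.sin(k * math.pi / (2 * n)) ** 2
    lam_A = 4.0 * s                    # claimed eigenvalue of A
    lam_R = 1.0 - 2.0 * omega * s      # claimed eigenvalue of R_omega
    for aw, rw, wj in zip(apply_A(w), apply_R(w), w):
        worst = max(worst, abs(aw - lam_A * wj), abs(rw - lam_R * wj))
```

A value of `worst` at round-off level confirms that each mode $w_k$ is an eigenvector of both $A$ and $R_\omega$ with the stated eigenvalues.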
analytical approach (cont'd)
we have seen that after $m$ iterations, the error is given by $e^{(m)} = R_\omega^m e^{(0)}$
using the eigenvector expansion for $e^{(0)}$ and $R_\omega w_k = \lambda_k(R_\omega) w_k$, we have
$e^{(m)} = R_\omega^m e^{(0)} = \sum_{k=1}^{n-1} c_k R_\omega^m w_k = \sum_{k=1}^{n-1} c_k \lambda_k^m(R_\omega) w_k$
this expansion for $e^{(m)}$ shows that after $m$ iterations, the $k$-th mode of the initial error has been reduced by a factor of $\lambda_k^m(R_\omega)$
it should further be noted that the weighted JACOBI method does not mix modes, i.e. amplitudes of single modes can change, but they cannot be converted into different modes
we would like to find the optimal value of $\omega$ with $0 < \omega \le 1$ that makes $|\lambda_k(R_\omega)|$ as small as possible for all $1 \le k \le n-1$
analytical approach (cont'd)
eigenvalues $\lambda_k = 1 - 2\omega \sin^2\left(\frac{k\pi}{2n}\right)$ of the iteration matrix $R_\omega$ for $\omega = 1/3$, $1/2$, $2/3$, and $1$
[figure: $\lambda_k(R_\omega)$ (from $-1$ to $1$) plotted over the wave number $k$ (up to $n$, with $n/2$ marked) for $\omega = 1/3$, $1/2$, $2/3$, and $1$]
analytical approach (cont'd)
for all values of $\omega$ satisfying $0 < \omega \le 1$ it holds that
$\lambda_1 = 1 - 2\omega \sin^2\left(\frac{\pi}{2n}\right) = 1 - 2\omega \sin^2\left(\frac{\pi h}{2}\right) \approx 1 - \frac{\omega \pi^2 h^2}{2}$
this implies that $\lambda_1$, the eigenvalue associated with the smoothest mode, will always be close to 1, hence no value of $\omega$ will reduce the smooth components of the error effectively
even worse, the smaller the grid spacing $h$, the closer $\lambda_1$ is to 1, thus deteriorating convergence
as no value of $\omega$ damps the smooth components satisfactorily, we are looking for values that provide the best damping of the oscillatory components (those with $n/2 \le k \le n-1$)
solving the condition $\lambda_{n/2}(R_\omega) = -\lambda_n(R_\omega)$, i.e. $1 - \omega = 2\omega - 1$, leads to the optimal value $\omega = 2/3$
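The optimal weighting factor can also be found by a brute-force scan. This is a minimal sketch, assuming $n = 64$ and a candidate grid of $\omega$ values in steps of $0.01$ (both illustrative choices); `smoothing_factor` is a hypothetical name for the largest $|\lambda_k(R_\omega)|$ over the oscillatory modes.

```python
import math

def eigenvalue(k, n, omega):
    """Eigenvalue lambda_k of the weighted JACOBI iteration matrix R_omega."""
    return 1.0 - 2.0 * omega * math.sin(k * math.pi / (2 * n)) ** 2

def smoothing_factor(omega, n=64):
    """Largest |lambda_k| over the oscillatory modes n/2 <= k <= n-1."""
    return max(abs(eigenvalue(k, n, omega)) for k in range(n // 2, n))

candidates = [i / 100 for i in range(1, 101)]    # omega = 0.01, 0.02, ..., 1.00
best = min(candidates, key=smoothing_factor)     # candidate with the best damping

factor_at_two_thirds = smoothing_factor(2.0 / 3.0)
smoothest_eigenvalue = eigenvalue(1, 64, 2.0 / 3.0)
```

With $\omega = 2/3$ every oscillatory mode is reduced by at least a factor of $1/3$ per sweep, while the smoothest mode is left almost untouched.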
analytical approach (cont'd)
again, we consider the weighted JACOBI method applied to the one-dimensional model problem $Au = 0$ on a grid with $n = 64$ points
for initial guesses consisting of single modes $m_k$ ($1 \le k \le 63$) only, the figures show the number of iterations required to reduce the norm of the initial error by a factor of 100 for weighting factors of $\omega = 1$ and $\omega = 2/3$
[figure: two panels, iterations versus wave number $k$, for $\omega = 1$ and for $\omega = 2/3$]
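For a single-mode initial guess the iteration count follows directly from the eigenvalue, because the error norm shrinks by $|\lambda_k(R_\omega)|$ per sweep. The sketch below computes the smallest $m$ with $|\lambda_k|^m \le 1/100$; the function name `iterations_needed` and the guard for a vanishing eigenvalue are assumptions made for illustration.

```python
import math

def iterations_needed(k, omega, n=64, factor=100.0):
    """Smallest m with |lambda_k|^m <= 1/factor for a single-mode initial guess."""
    lam = abs(1.0 - 2.0 * omega * math.sin(k * math.pi / (2 * n)) ** 2)
    if lam == 0.0:
        return 1                      # mode is eliminated in a single sweep
    return max(1, math.ceil(math.log(1.0 / factor) / math.log(lam)))

counts_plain = [iterations_needed(k, 1.0) for k in range(1, 64)]          # omega = 1
counts_damped = [iterations_needed(k, 2.0 / 3.0) for k in range(1, 64)]   # omega = 2/3
```

For $\omega = 1$ both the smoothest and the most oscillatory modes need thousands of sweeps, while for $\omega = 2/3$ all modes with $k \ge n/2$ need only a handful.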
analytical approach (cont'd)
another perspective on this convergence behaviour is provided below
plotted are approximations for different initial guesses consisting of $m_3$, $m_{16}$, and $(m_2 + m_{16})/2$, shown after one and after 10 relaxation steps
[figure: three panels with the initial guesses $m_3$, $m_{16}$, and $(m_2 + m_{16})/2$, each together with the approximation after 10 relaxation steps]
elements of multigrid
what do smooth components look like on coarser grids?
consider some fine grid ($\Omega^h$) and some coarse grid ($\Omega^{2h}$) with double the grid spacing
[figure: a smooth wave with $k = 4$ on $\Omega^h$ with $n = 12$ points, and its representation on $\Omega^{2h}$ with $n = 6$ points via direct projection]
the wave becomes more oscillatory on $\Omega^{2h}$
elements of multigrid (cont'd)
consequence: passing from the fine to the coarse grid, a mode becomes more oscillatory
this is true provided that $1 \le k < n/2$; the $k = n/2$ mode on $\Omega^h$ becomes the zero vector on $\Omega^{2h}$
idea: when relaxation begins to stall, signalling the predominance of smooth error modes, move to a coarser grid, as smooth error modes appear more oscillatory there
this leads to the following strategy, a procedure which is the basis of what is called the correction scheme
- relax on $Au = f$ on $\Omega^h$ to obtain an approximation $v^h$
- compute the residual $r = f - Av^h$
- relax on $Ae = r$ on $\Omega^{2h}$ to obtain an approximation to the error $e^{2h}$
- correct $v^h \leftarrow v^h + e^{2h}$ on $\Omega^h$ with the error estimate $e^{2h}$ obtained on $\Omega^{2h}$
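The effect of passing a mode to the coarse grid can be checked with a short computation. The following minimal sketch assumes direct projection onto the even-numbered grid points and uses $n = 12$ and $k = 4$ as illustrative values; all variable names are hypothetical.

```python
import math

n = 12      # number of fine-grid intervals
k = 4       # wave number; smooth on the fine grid since k < n/2

fine = [math.sin(j * k * math.pi / n) for j in range(n + 1)]
projected = fine[::2]                                   # keep the even-numbered points
coarse_mode = [math.sin(j * k * math.pi / (n // 2)) for j in range(n // 2 + 1)]

# the projected fine-grid mode is the k-th mode of the coarse grid
max_diff = max(abs(a - b) for a, b in zip(projected, coarse_mode))

# relative frequency k / n doubles on the coarse grid: 4/12 becomes 4/6
fine_ratio = k / n
coarse_ratio = k / (n // 2)

# the k = n/2 mode vanishes at every even-numbered point
half_mode = [math.sin(2 * j * (n // 2) * math.pi / n) for j in range(n // 2 + 1)]
half_mode_norm = max(abs(x) for x in half_mode)
```

The wave number stays the same, but it is measured against half as many grid points, so the mode moves into the oscillatory half of the coarse-grid spectrum.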
elements of multigrid (cont'd)
question: how to transfer the residual $r^h$ from $\Omega^h$ to $\Omega^{2h}$ (called restriction), and how to transfer the error estimate $e^{2h}$ back from $\Omega^{2h}$ to $\Omega^h$ (called interpolation or prolongation)?
for further discussions, we consider only cases in which the coarse grid has twice the grid spacing of the next finest one (this is universal practice, as there is no advantage in using grid spacings with ratios other than 2)
the simplest prolongation method is quite effective and, thus, used for most multigrid purposes
here the linear prolongation operator $I_{2h}^h$ takes coarse-grid vectors and produces fine-grid vectors according to $I_{2h}^h v^{2h} = v^h$, where
$v^h_{2j} = v^{2h}_j$ and $v^h_{2j+1} = \frac{1}{2}\left(v^{2h}_j + v^{2h}_{j+1}\right)$, $0 \le j \le n/2 - 1$
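elements of multigrid (cont'd)
[figure: linear interpolation of a vector from $\Omega^{2h}$ to $\Omega^h$]
for $n = 7$ (seven interior fine-grid points, three interior coarse-grid points) the linear prolongation has the form
$I_{2h}^h v^{2h} = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 \\ 2 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}_{2h} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \\ v_6 \\ v_7 \end{pmatrix}_{h} = v^h$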
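Linear prolongation translates into a short loop. This is a minimal sketch, assuming that vectors include both boundary points (so a coarse vector has $n/2 + 1$ entries); the function name `prolongate` is a hypothetical choice.

```python
def prolongate(coarse):
    """Linear interpolation from Omega^{2h} to Omega^h (vectors include boundary points)."""
    m = len(coarse) - 1                    # number of coarse-grid intervals, n/2
    fine = [0.0] * (2 * m + 1)
    for j in range(m):
        fine[2 * j] = coarse[j]                              # coincident points are copied
        fine[2 * j + 1] = 0.5 * (coarse[j] + coarse[j + 1])  # in-between points are averaged
    fine[2 * m] = coarse[m]
    return fine

# three interior coarse-grid values are mapped to seven interior fine-grid values
fine = prolongate([0.0, 2.0, 4.0, 2.0, 0.0])
```

With zero boundary values the seven interior entries of `fine` are exactly what the matrix form for $n = 7$ produces.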
elements of multigrid (cont'd)
quality of the prolongation (I)
assume the real error is a smooth vector on the fine grid $\Omega^h$ and the coarse-grid approximation to the error obtained on $\Omega^{2h}$ is exact at the coarse-grid points
hence, when prolongated to the fine grid, the interpolant is also smooth
[figure: if the exact error on $\Omega^h$ (indicated by the point markers) is smooth, an interpolant of the coarse-grid error $e^{2h}$ (solid line connecting points) should give a good representation of the exact error]
elements of multigrid (cont'd)
quality of the prolongation (II)
by contrast, if the real error is oscillatory on the fine grid $\Omega^h$, even a very good coarse-grid approximation may produce an interpolant that is not very accurate
thus, interpolation is most effective when the error is smooth (in contrast to relaxation, which is most effective when the error is oscillatory)
[figure: if the exact error on $\Omega^h$ (indicated by the point markers) is oscillatory, an interpolant of the coarse-grid error $e^{2h}$ (solid line connecting points) may give a poor representation of the exact error]
elements of multigrid (cont'd)
the second class of intergrid transfer operations, the restriction operators, involves moving vectors from finer to coarser grids
the most obvious restriction operator is injection, which is defined by $I_h^{2h} v^h = v^{2h}$, where
$v^{2h}_j = v^h_{2j}$
[figure: trivial restriction from $\Omega^h$ to $\Omega^{2h}$, called injection]
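elements of multigrid (cont'd)
for $n = 7$, the restriction (injection) has the form
$I_h^{2h} v^h = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \\ v_6 \\ v_7 \end{pmatrix}_{h} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}_{2h} = v^{2h}$
an alternative restriction operator, called full weighting, is defined by $I_h^{2h} v^h = v^{2h}$, where
$v^{2h}_j = \frac{1}{4}\left(v^h_{2j-1} + 2v^h_{2j} + v^h_{2j+1}\right)$, $1 \le j \le n/2 - 1$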
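elements of multigrid (cont'd)
[figure: full-weighting restriction of a vector from $\Omega^h$ to $\Omega^{2h}$]
for $n = 7$, the restriction (full weighting) has the form
$I_h^{2h} v^h = \frac{1}{4} \begin{pmatrix} 1 & 2 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 2 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 2 & 1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \\ v_6 \\ v_7 \end{pmatrix}_{h} = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}_{2h} = v^{2h}$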
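Both restriction operators are easy to state in code. The following minimal sketch assumes vectors that include the boundary points and copies the boundary values unchanged under full weighting (an assumption, since the definition above covers interior points only); `inject` and `full_weighting` are hypothetical names.

```python
def inject(fine):
    """Injection: keep every second fine-grid value."""
    return fine[::2]

def full_weighting(fine):
    """Full weighting: weighted average (1/4, 1/2, 1/4) of neighbouring fine-grid values."""
    m = (len(fine) - 1) // 2               # number of coarse-grid intervals
    coarse = [0.0] * (m + 1)
    coarse[0], coarse[m] = fine[0], fine[-1]   # boundary values are copied
    for j in range(1, m):
        coarse[j] = 0.25 * (fine[2 * j - 1] + 2.0 * fine[2 * j] + fine[2 * j + 1])
    return coarse

fine = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0, 0.0]
by_injection = inject(fine)
by_full_weighting = full_weighting(fine)
```

Injection reproduces the fine-grid values at the coarse-grid points, whereas full weighting smooths the peak of this example vector.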
variational properties
one reason for choosing full weighting as restriction operator is the fact that
$I_h^{2h} = c \left(I_{2h}^h\right)^T$, $c \in \mathbb{R}$,
called the variational property
when expressing the coarse-grid problem, $A^{2h} v^{2h} = f^{2h}$, we assume that $A^{2h}$ is the $\Omega^{2h}$ version of the original operator $A^h$, i.e. the result of discretising the problem on $\Omega^{2h}$
question: how to generate $A^{2h}$?
let us assume that the error $e^h = u^h - v^h$ of some approximation $v^h$ lies entirely in the range of interpolation, denoted as $\mathcal{R}(I_{2h}^h)$
this means that for some vector $v^{2h} \in \Omega^{2h}$, $e^h = I_{2h}^h v^{2h}$
therefore, the residual equation on $\Omega^h$ may be written as
$A^h e^h = A^h I_{2h}^h v^{2h} = r^h$ (4.3.5)
variational properties (cont'd)
in (4.3.5), $A^h$ acts on a vector that lies entirely in the range of interpolation
but how does $A^h$ act on $\mathcal{R}(I_{2h}^h)$?
let $v^{2h}$ be some arbitrary vector on $\Omega^{2h}$
[figure: $v^{2h}$ on $\Omega^{2h}$, $I_{2h}^h v^{2h}$ on $\Omega^h$, and $A^h I_{2h}^h v^{2h}$ on $\Omega^h$, drawn over the fine-grid points 0 to 8]
variational properties (cont'd)
we may conclude that the odd rows of $A^h I_{2h}^h$ in (4.3.5) are zero and the even rows correspond to the coarse-grid points of $\Omega^{2h}$
therefore, we can find a coarse-grid form of the residual equation by dropping the odd rows of (4.3.5)
this is formally done by applying the restriction operator $I_h^{2h}$ to both sides of (4.3.5), hence the residual equation becomes
$I_h^{2h} A^h I_{2h}^h v^{2h} = I_h^{2h} r^h$
this gives a plausible definition for the coarse-grid operator: $A^{2h} = I_h^{2h} A^h I_{2h}^h$
the terms of $A^{2h}$ may be computed explicitly by applying $I_h^{2h} A^h I_{2h}^h$ term by term to the $j$-th unit vector on $\Omega^{2h}$
this establishes that the $j$-th column of $A^{2h}$ and, by symmetry, also the $j$-th row of $A^{2h}$ are given by
$\frac{1}{(2h)^2} \left( -1 \;\; 2 \;\; -1 \right)$
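variational properties (cont'd)
calculation of the $j$-th row of $A^{2h} = I_h^{2h} A^h I_{2h}^h$ (values around the coarse-grid points $j-1$, $j$, $j+1$):
$e_j^{2h}$: $0$, $1$, $0$
$I_{2h}^h e_j^{2h}$: $0$, $\frac{1}{2}$, $1$, $\frac{1}{2}$, $0$
$A^h I_{2h}^h e_j^{2h}$: $-\frac{1}{2h^2}$, $0$, $\frac{1}{h^2}$, $0$, $-\frac{1}{2h^2}$
$I_h^{2h} A^h I_{2h}^h e_j^{2h}$: $-\frac{1}{4h^2}$, $\frac{1}{2h^2}$, $-\frac{1}{4h^2}$
hence, we would get the same result if the original problem were simply discretised on $\Omega^{2h}$ using second-order finite differences
therefore, by this definition, $A^{2h}$ really is the $\Omega^{2h}$ version of $A^h$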
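The column-by-column calculation can be repeated numerically. This is a minimal sketch, assuming $h = 1/8$, interior values only (zero boundary values), linear prolongation, and full weighting; the helper names `apply_A`, `prolongate`, and `full_weighting` are hypothetical.

```python
def apply_A(x, h):
    """Apply (1/h^2) tridiag(-1, 2, -1) to interior values with zero boundary values."""
    n = len(x)
    return [
        (-(x[i - 1] if i > 0 else 0.0) + 2.0 * x[i] - (x[i + 1] if i < n - 1 else 0.0)) / h ** 2
        for i in range(n)
    ]

def prolongate(c):
    """Linear interpolation of m interior coarse values to 2m + 1 interior fine values."""
    padded = [0.0] + list(c) + [0.0]
    fine = []
    for j in range(len(c) + 1):
        fine.append(0.5 * (padded[j] + padded[j + 1]))   # point between two coarse points
        if j < len(c):
            fine.append(padded[j + 1])                   # point coinciding with a coarse point
    return fine

def full_weighting(f):
    """Full weighting of 2m + 1 interior fine values to m interior coarse values."""
    m = (len(f) - 1) // 2
    return [0.25 * (f[2 * j] + 2.0 * f[2 * j + 1] + f[2 * j + 2]) for j in range(m)]

h = 1.0 / 8.0
m = 3                                    # interior coarse-grid points
max_diff = 0.0
for j in range(m):
    unit = [1.0 if i == j else 0.0 for i in range(m)]
    galerkin_column = full_weighting(apply_A(prolongate(unit), h))   # column of I A^h I
    direct_column = apply_A(unit, 2.0 * h)                           # discretisation on the coarse grid
    max_diff = max(max_diff, max(abs(a - b) for a, b in zip(galerkin_column, direct_column)))
```

For this model problem the product $I_h^{2h} A^h I_{2h}^h$ and the direct discretisation with spacing $2h$ coincide column by column.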
two-grid correction scheme
combining the well-defined ways to transfer vectors between grids with the previous correction scheme, we define a two-grid method; the parameters $\nu_1$, $\nu_2$ control the number of relaxation steps (in practice often 1, 2, or 3)
- relax $\nu_1$ times on $A^h v^h = f^h$ on $\Omega^h$ with initial guess $v^h$
- compute the residual $r^h = f^h - A^h v^h$
- restrict the residual $r^h$ to the coarse grid by $r^{2h} = I_h^{2h} r^h$
- solve $A^{2h} e^{2h} = r^{2h}$ on $\Omega^{2h}$
- prolongate the coarse-grid error $e^{2h}$ to the fine grid by $e^h = I_{2h}^h e^{2h}$
- correct the fine-grid approximation $v^h \leftarrow v^h + e^h$
- relax $\nu_2$ times on $A^h v^h = f^h$ on $\Omega^h$ with the corrected approximation $v^h$
two-grid correction scheme (cont'd)
example: we consider the weighted JACOBI method with $\omega = 2/3$ applied to our one-dimensional model problem $Au = 0$ on a grid with $n = 64$ points
as initial guess, $(m_{16} + m_{40})/2$ is used, consisting of the modes $k = 16$ and $k = 40$
the following two-grid correction scheme is applied
- relax three times on $A^h v^h = 0$ on $\Omega^h$ with initial guess $(m_{16} + m_{40})/2$
- compute $r^{2h} = I_h^{2h} r^h$
- relax three times on $A^{2h} e^{2h} = r^{2h}$ on $\Omega^{2h}$ with initial guess $e^{2h} = 0$
- correct the fine-grid approximation $v^h \leftarrow v^h + I_{2h}^h e^{2h}$
- relax three times on $A^h v^h = 0$ on $\Omega^h$
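The example can be sketched in code as follows. This is a minimal illustration, assuming the operator $(1/h^2)\,\mathrm{tridiag}(-1, 2, -1)$, full weighting as restriction, linear prolongation, and vectors that include the boundary points; all function names are hypothetical and the sketch is not the lecture's reference implementation.

```python
import math

OMEGA = 2.0 / 3.0

def relax(v, f, h, sweeps):
    """Weighted JACOBI sweeps for (-v[i-1] + 2 v[i] - v[i+1]) / h^2 = f[i], zero boundaries."""
    for _ in range(sweeps):
        old = v[:]
        for i in range(1, len(v) - 1):
            jacobi = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            v[i] = (1.0 - OMEGA) * old[i] + OMEGA * jacobi
    return v

def residual(v, f, h):
    r = [0.0] * len(v)
    for i in range(1, len(v) - 1):
        r[i] = f[i] - (-v[i - 1] + 2.0 * v[i] - v[i + 1]) / (h * h)
    return r

def restrict(fine):
    """Full weighting; boundary entries of the coarse vector stay zero."""
    m = (len(fine) - 1) // 2
    coarse = [0.0] * (m + 1)
    for j in range(1, m):
        coarse[j] = 0.25 * (fine[2 * j - 1] + 2.0 * fine[2 * j] + fine[2 * j + 1])
    return coarse

def prolongate(coarse):
    m = len(coarse) - 1
    fine = [0.0] * (2 * m + 1)
    for j in range(m):
        fine[2 * j] = coarse[j]
        fine[2 * j + 1] = 0.5 * (coarse[j] + coarse[j + 1])
    fine[2 * m] = coarse[m]
    return fine

def two_grid_cycle(v, f, h, nu=3):
    v = relax(v, f, h, nu)                                # relax on the fine grid
    r2h = restrict(residual(v, f, h))                     # restrict the residual
    e2h = relax([0.0] * len(r2h), r2h, 2.0 * h, nu)       # relax on the residual equation
    v = [vi + ei for vi, ei in zip(v, prolongate(e2h))]   # coarse-grid correction
    return relax(v, f, h, nu)                             # relax again on the fine grid

n = 64
h = 1.0 / n
f = [0.0] * (n + 1)
v = [0.5 * (math.sin(16 * i * math.pi / n) + math.sin(40 * i * math.pi / n)) for i in range(n + 1)]
norm_before = max(abs(x) for x in v)
v = two_grid_cycle(v, f, h)
norm_after = max(abs(x) for x in v)
```

Relaxation removes the oscillatory $k = 40$ component, and the coarse-grid correction takes care of most of the $k = 16$ component, which plain relaxation reduces only slowly.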
two-grid correction scheme (cont'd)
[figure: six panels showing the approximation for the initial guess $(m_{16} + m_{40})/2$, after one relaxation step, after three relaxation steps, after coarse-grid correction, after one full 2-grid cycle, and after two full 2-grid cycles]
V-cycle scheme
question: what is the best way to solve the coarse-grid problem $A^{2h} e^{2h} = r^{2h}$?
notice: the coarse-grid problem is not much different from the original problem
therefore, we can apply the two-grid scheme to the residual equation on $\Omega^{2h}$, which means relaxing there (on $\Omega^{2h}$) and then moving forward to $\Omega^{4h}$ for the correction step
recursive application leads to successively coarser grids until a direct solution of the residual equation is possible
simplified notation for $m > 1$ grids with grid spacing $l \in \{2h, 4h, \dots, 2^{m-1}h\}$:
- the RHS vector of the residual equation on $\Omega^l$ is also called $f^l$ instead of $r^l$
- approximations on $\Omega^l$ are also called $v^l$ instead of $e^l$
- initial guesses on the first visit to $\Omega^l$ are chosen as $v^l = 0$
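V-cycle scheme (cont'd)
idea of the recursive algorithm with $\nu_1$ pre- and $\nu_2$ post-smoothing steps
descending from the finest to the coarsest grid:
- $\Omega^h$: relax $\nu_1$ times on $A^h u^h = f^h$, then $f^{2h} = I_h^{2h} r^h$
- $\Omega^{2h}$: relax $\nu_1$ times on $A^{2h} e^{2h} = f^{2h}$, then $f^{4h} = I_{2h}^{4h} r^{2h}$
- $\Omega^{4h}$: relax $\nu_1$ times on $A^{4h} e^{4h} = f^{4h}$, then $f^{8h} = I_{4h}^{8h} r^{4h}$
- $\Omega^{8h}$: solve $e^{8h} = (A^{8h})^{-1} f^{8h}$
ascending back to the finest grid:
- $\Omega^{4h}$: correct $e^{4h} \leftarrow e^{4h} + I_{8h}^{4h} e^{8h}$, then relax $\nu_2$ times on $A^{4h} e^{4h} = f^{4h}$
- $\Omega^{2h}$: correct $e^{2h} \leftarrow e^{2h} + I_{4h}^{2h} e^{4h}$, then relax $\nu_2$ times on $A^{2h} e^{2h} = f^{2h}$
- $\Omega^h$: correct $u^h \leftarrow u^h + I_{2h}^h e^{2h}$, then relax $\nu_2$ times on $A^h u^h = f^h$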
V-cycle scheme (cont'd)
[figure: V-cycle schedule descending from $\Omega^h$ through $\Omega^{2h}$ and $\Omega^{4h}$ to $\Omega^{8h}$ and back]
$v^l \leftarrow \mathrm{MG}_V(v^l, f^l)$
1. relax $\nu_1$ times on $A^l v^l = f^l$ with initial guess $v^l$
2. if $\Omega^l$ = coarsest grid, then go to step 4
   else
   $f^{2l} \leftarrow I_l^{2l}(f^l - A^l v^l)$
   $v^{2l} \leftarrow 0$
   $v^{2l} \leftarrow \mathrm{MG}_V(v^{2l}, f^{2l})$
3. correct $v^l \leftarrow v^l + I_{2l}^l v^{2l}$
4. relax $\nu_2$ times on $A^l v^l = f^l$
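The recursive V-cycle can be sketched compactly. The following is a minimal illustration, assuming weighted JACOBI with $\omega = 2/3$ as smoother, full weighting, linear prolongation, $\nu_1 = \nu_2 = 2$, and a test problem $-u'' = \pi^2 \sin(\pi x)$ with known solution $\sin(\pi x)$; these choices and all function names are assumptions made for the example.

```python
import math

OMEGA = 2.0 / 3.0

def relax(v, f, h, sweeps):
    """Weighted JACOBI sweeps for (-v[i-1] + 2 v[i] - v[i+1]) / h^2 = f[i], zero boundaries."""
    for _ in range(sweeps):
        old = v[:]
        for i in range(1, len(v) - 1):
            jacobi = 0.5 * (old[i - 1] + old[i + 1] + h * h * f[i])
            v[i] = (1.0 - OMEGA) * old[i] + OMEGA * jacobi
    return v

def residual(v, f, h):
    r = [0.0] * len(v)
    for i in range(1, len(v) - 1):
        r[i] = f[i] - (-v[i - 1] + 2.0 * v[i] - v[i + 1]) / (h * h)
    return r

def restrict(fine):
    """Full weighting; boundary entries of the coarse vector stay zero."""
    m = (len(fine) - 1) // 2
    coarse = [0.0] * (m + 1)
    for j in range(1, m):
        coarse[j] = 0.25 * (fine[2 * j - 1] + 2.0 * fine[2 * j] + fine[2 * j + 1])
    return coarse

def prolongate(coarse):
    m = len(coarse) - 1
    fine = [0.0] * (2 * m + 1)
    for j in range(m):
        fine[2 * j] = coarse[j]
        fine[2 * j + 1] = 0.5 * (coarse[j] + coarse[j + 1])
    fine[2 * m] = coarse[m]
    return fine

def v_cycle(v, f, h, nu1=2, nu2=2):
    v = relax(v, f, h, nu1)                                    # step 1
    if len(v) > 3:                                             # step 2: not the coarsest grid
        f2 = restrict(residual(v, f, h))
        v2 = v_cycle([0.0] * len(f2), f2, 2.0 * h, nu1, nu2)   # zero initial guess
        v = [a + b for a, b in zip(v, prolongate(v2))]         # step 3
    return relax(v, f, h, nu2)                                 # step 4

n = 64
h = 1.0 / n
x = [i * h for i in range(n + 1)]
f = [math.pi ** 2 * math.sin(math.pi * xi) for xi in x]
v = [0.0] * (n + 1)
res_initial = max(abs(r) for r in residual(v, f, h))
for _ in range(10):
    v = v_cycle(v, f, h)
res_final = max(abs(r) for r in residual(v, f, h))
err = max(abs(vi - math.sin(math.pi * xi)) for vi, xi in zip(v, x))
```

Here the coarsest grid has a single interior point and is only relaxed, as in the algorithm above; a direct solve on that level would be an equally valid design choice.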
μ-cycle scheme
[figure: W-cycle ($\mu = 2$) schedule on the grids $\Omega^h$, $\Omega^{2h}$, $\Omega^{4h}$, and $\Omega^{8h}$]
$v^l \leftarrow \mathrm{MG}_\mu(v^l, f^l)$
1. relax $\nu_1$ times on $A^l v^l = f^l$ with initial guess $v^l$
2. if $\Omega^l$ = coarsest grid, then go to step 4
   else
   $f^{2l} \leftarrow I_l^{2l}(f^l - A^l v^l)$
   $v^{2l} \leftarrow 0$
   $v^{2l} \leftarrow \mathrm{MG}_\mu(v^{2l}, f^{2l})$ $\mu$ times
3. correct $v^l \leftarrow v^l + I_{2l}^l v^{2l}$
4. relax $\nu_2$ times on $A^l v^l = f^l$
In practice, only $\mu = 1$ (which gives the V-cycle) and $\mu = 2$ (which gives the W-cycle) are used.
full multigrid V-cycle
[figure: full multigrid schedule with $\nu_0 = 1$ on the grids $\Omega^h$, $\Omega^{2h}$, $\Omega^{4h}$, and $\Omega^{8h}$]
$v^l \leftarrow \mathrm{FMG}(f^l)$
1. if $\Omega^l$ = coarsest grid, set $v^l \leftarrow 0$ and go to step 3
   else
   $f^{2l} \leftarrow I_l^{2l}(f^l)$
   $v^{2l} \leftarrow \mathrm{FMG}(f^{2l})$
2. correct $v^l \leftarrow I_{2l}^l v^{2l}$
3. $v^l \leftarrow \mathrm{MG}_V(v^l, f^l)$ $\nu_0$ times
Here, the idea is to use coarse grids in order to obtain better initial guesses, a strategy called nested iteration.
complexity
storage: consider a $d$-dimensional grid with $n^d$ grid points on $\Omega^h$
on each coarser level, the number of grid points decreases by a factor of $(1/2)^d$, i.e. $|\Omega^{2h}| = (1/2)^d\,|\Omega^h|$, $|\Omega^{4h}| = (1/2)^d\,|\Omega^{2h}| = (1/4)^d\,|\Omega^h|$, ...
hence, storing all grids $\Omega^h, \Omega^{2h}, \Omega^{4h}, \dots$ requires memory for
$n^d \left(1 + 2^{-d} + 2^{-2d} + \dots\right)$
grid points
using the sum of the geometric series as an upper bound gives a good estimate of the memory requirements, namely
storage $< \frac{n^d}{1 - 2^{-d}}$
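The storage bound is easy to evaluate for concrete sizes. This is a minimal sketch, assuming $n = 64$ and seven levels (down to a single grid point per direction); `grid_points` and `storage_bound` are hypothetical helper names.

```python
def grid_points(n, d, levels):
    """Total number of grid points over all levels; the level i grid has (n / 2^i)^d points."""
    return sum((n // 2 ** i) ** d for i in range(levels))

def storage_bound(n, d):
    """Upper bound n^d / (1 - 2^(-d)) obtained from the geometric series."""
    return n ** d / (1.0 - 2.0 ** (-d))

totals = {d: grid_points(64, d, 7) for d in (1, 2, 3)}
bounds = {d: storage_bound(64, d) for d in (1, 2, 3)}
```

The overhead of the coarse grids shrinks with the dimension: a factor of at most 2 for $d = 1$, $4/3$ for $d = 2$, and $8/7$ for $d = 3$ relative to the finest grid alone.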
computational cost
it is convenient to measure the computational cost in terms of work units (WU), where one work unit is the cost of performing one relaxation sweep on the finest grid $\Omega^h$
we neglect the cost of restriction and prolongation operations, which typically amounts to 10 to 20% of the cost of the entire cycle
consider a V-cycle with one relaxation sweep on each level (i.e. $\nu_1 = \nu_2 = 1$)
each level is visited twice, and on each coarser level the work decreases by a factor of $(1/2)^d$
adding these costs and again using the geometric series for an upper bound gives the V-cycle computation cost as
$2 \left(1 + 2^{-d} + 2^{-2d} + \dots\right)\,\mathrm{WU} < \frac{2}{1 - 2^{-d}}\,\mathrm{WU}$
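The same geometric-series argument can be evaluated numerically for the V-cycle cost. The following minimal sketch assumes seven levels and neglects intergrid transfers, as in the text; the function names are hypothetical.

```python
def v_cycle_cost(d, levels, nu1=1, nu2=1):
    """Cost of one V-cycle in work units: nu1 + nu2 sweeps on each level, scaled by 2^(-i d)."""
    return sum((nu1 + nu2) * 2.0 ** (-i * d) for i in range(levels))

def cost_bound(d, nu1=1, nu2=1):
    """Upper bound (nu1 + nu2) / (1 - 2^(-d)) from the geometric series."""
    return (nu1 + nu2) / (1.0 - 2.0 ** (-d))

costs = {d: v_cycle_cost(d, 7) for d in (1, 2, 3)}
limits = {d: cost_bound(d) for d in (1, 2, 3)}
```

For $\nu_1 = \nu_2 = 1$ a V-cycle therefore costs less than 4 WU in one dimension, less than $8/3$ WU in two dimensions, and less than $16/7$ WU in three dimensions.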
example
let $Au = 0$ denote a linear system derived from a finite-difference discretisation of the one-dimensional second-order boundary value problem
$-u''(x) = 0$, $0 < x < 1$, with $u(0) = 0$ and $u(1) = 1$
with grid spacing $h = 2^{-14}$, thus we have $n = 1/h = 2^{14} = 16{,}384$ grid points
the matrix $A$ is SPD and diagonally dominant, hence the JACOBI method will converge for an arbitrary start vector $v^{(0)}$
as initial guess we choose $v^{(0)} = v^h = 0$ for a multigrid solver (V-cycle)
example (cont'd)
[figure: convergence plot of the (standard) JACOBI method]
example (cont'd)
[figure: convergence plot of the multigrid method (V-cycle, 11 levels, $\nu_1 = \nu_2 = 3$)]
example (cont'd)
[figure: comparison of JACOBI, multigrid, and CG]