Lab 1: Iterative Methods for Solving Linear Systems

January 22, 2017

Introduction

Many real-world applications require the solution of very large, sparse linear systems for which direct methods such as Gaussian elimination are prohibitively expensive, both in computational cost and in memory. In this lab you will learn how to implement the Jacobi, Gauss-Seidel and Conjugate Gradient (CG) algorithms (and, optionally, the Preconditioned Conjugate Gradient (PCG) and Generalized Minimal RESidual (GMRES) algorithms) to solve linear systems, and you will investigate the convergence properties of these algorithms.

1 Problem 1

1.1 Problem statement

Consider the following advection-diffusion equation in 1D with the following boundary conditions:

$$ -\epsilon u'' + \beta u' = 0, \qquad u(0) = 0, \quad u(1) = 1, \qquad (1) $$

where ε is the diffusion coefficient and β is the velocity at which the quantity is flowing. Equation (1) can be discretized and written in matrix form as

$$ Ax = b. $$

We have provided a function that discretizes these equations and returns the matrices for you. An example call to this function is shown below.

import numpy as np

N = 60
beta = 1; epsilon = 1e-2
A, b = generate_matrices(epsilon, beta, N)

Note that we have included the header import numpy as np. NumPy is the package we will use for numerical linear algebra in Python. N is the number of intervals used in the discretization: we are essentially dividing the interval [0, 1] into N intervals, each of size h = 1/N. The larger N is, the more accurate the solution. Fix N = 60 for now.

To solve the linear system directly using a Python function, type

x = np.linalg.solve(A, b)
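As a quick sanity check of the direct solve (our own suggestion, assuming A, b and x are the NumPy arrays produced by the calls above), you can look at the residual norm, which should be close to machine precision:

# Residual of the direct solution; expect a value near machine precision.
print(np.linalg.norm(np.dot(A, x) - b))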

The analytic solution is given by

$$ u_{ex}(x) = \frac{\exp(\beta x / \epsilon) - 1}{\exp(\beta / \epsilon) - 1}. $$

You can compare your solution to the exact solution. To do this, and to view the accuracy of the computed solution, type

visualize_solution(x)

Exercise: Try experimenting with different values of N and observe what happens. You will see that the larger N is, the smaller the discretization error and the better the match with the exact solution.

1.2 Jacobi method

We start by implementing the Jacobi method for solving linear systems. The Jacobi algorithm and the Gauss-Seidel algorithm (which you will implement later) belong to a class of iterations for solving linear systems called stationary iterations. They are named as such because the solution of the linear system is expressed as the stationary point of a fixed-point iteration. The matrix A,

$$ A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}, $$

can be split into A = D + R, where D is its diagonal component and R is the matrix that remains after subtracting the diagonal, R = A - D:

$$ D = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ 0 & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & a_{nn} \end{pmatrix}, \qquad R = \begin{pmatrix} 0 & a_{12} & \cdots & a_{1n} \\ a_{21} & 0 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & 0 \end{pmatrix}. $$

The Jacobi method then approaches the solution of the linear system iteratively via

$$ x^{k+1} = D^{-1}(b - R x^k) \qquad \text{or, componentwise,} \qquad x_i^{k+1} = \frac{1}{a_{ii}} \Big( b_i - \sum_{j \neq i} a_{ij} x_j^k \Big). $$

A basic implementation of the Jacobi method is given below.

m, n = np.shape(A)
x_jacobi = np.zeros(np.shape(x))
maxit = 20
nm = np.zeros((maxit, 1))
D = np.diag(np.diag(A))
R = A - D
Dinvb = np.linalg.solve(D, b)
for k in np.arange(maxit):
    x_jacobi = Dinvb - np.linalg.solve(D, np.dot(R, x_jacobi))
    nm[k] = np.linalg.norm(np.dot(A, x_jacobi) - b)
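Because you will run the same iteration again on the 2D problem in Problem 2, it may be convenient to wrap the loop above in a small function. The sketch below is our own packaging of the code above (the name jacobi and the flattening of b are our choices, not part of the lab files):

def jacobi(A, b, maxit=20):
    """Run maxit Jacobi iterations on A x = b, starting from x = 0.
    Returns the final iterate and the residual norm at every iteration."""
    b = np.ravel(b)                        # flatten in case b is a column vector
    x = np.zeros(len(b))
    nm = np.zeros(maxit)
    D = np.diag(np.diag(A))                # diagonal part of A
    R = A - D                              # remainder, so that A = D + R
    Dinvb = np.linalg.solve(D, b)
    for k in range(maxit):
        x = Dinvb - np.linalg.solve(D, np.dot(R, x))
        nm[k] = np.linalg.norm(np.dot(A, x) - b)
    return x, nm

x_jacobi, nm = jacobi(A, b)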

Figure 1: (left) Plot of the residual norm vs. the iteration count for several values of N. (right) Semilog plot of the residual norm vs. the iteration count for several values of N.

Again, if you want to compare the computed solution with the analytic solution, just call the visualize_solution function.

visualize_solution(x_jacobi)

To see how the method is converging, plot the norm of the residual, ||Ax - b||_2, as a function of the iteration number. Note that we have stored the residual norms in a vector called nm.

import matplotlib.pyplot as plt
plt.plot(nm)

In Figure 1 we plot the behavior of the residual norm vs. the iteration count for several values of N. The plot on the right is on a semilog scale, which makes it clear that the convergence rate is slower for larger N. This is because the spectral radius ρ(D^{-1}R) gets larger as N gets larger. Recall the definition of the spectral radius,

$$ \rho(A) = \max\{ |\lambda_1|, |\lambda_2|, \dots, |\lambda_n| \}, $$

where λ_i denotes the i-th eigenvalue of A. In Figure 2 we have plotted how the spectral radius varies with N. The spectral radius is the factor that controls convergence: for stationary methods to converge we need ρ < 1, and the smaller ρ is, the faster the convergence rate.

Exercise: Calculate the spectral radius for the varying values of N.

T = np.linalg.solve(-D, R)
eigsT, _ = np.linalg.eig(T)
rho = np.max(np.abs(eigsT))

Another interesting fact that can be shown (and is a nice linear algebra exercise if you are interested) is that when N >= β/(2ε) the matrix is diagonally dominant, a highly desirable property and a sufficient (but not necessary) condition for the Jacobi method to converge. Recall that a matrix A is diagonally dominant if

$$ |a_{ii}| \ge \sum_{j \neq i} |a_{ij}| \qquad \text{for all } i. $$

Also recall that h = 1/N, so the condition for diagonal dominance can also be written in terms of the grid resolution as h <= 2ε/β. A short numerical check of this property is sketched below.
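If you would rather test diagonal dominance numerically than prove it, a small helper along these lines can be used (our own sketch, assuming A is the dense matrix returned by generate_matrices):

def is_diagonally_dominant(A):
    """True if |a_ii| >= sum over j != i of |a_ij| holds for every row i."""
    diag = np.abs(np.diag(A))
    off_diag_sums = np.sum(np.abs(A), axis=1) - diag
    return bool(np.all(diag >= off_diag_sums))

# For beta = 1 and epsilon = 1e-2, compare N below and above beta/(2*epsilon) = 50.
print(is_diagonally_dominant(A))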

Figure 2: Spectral radius as a function of the grid size. Note that as the grid size gets larger, the spectral radius goes to 1, and the convergence of the Jacobi method will thus be slower.

Exercise: Run the Jacobi algorithm with N < β/(2ε) and you will observe that the Jacobi method will probably not converge.

2 Problem 2

2.1 Problem statement

Let us consider a slightly more complicated problem: a 2D Poisson problem on the square [-1, 1] x [-1, 1],

$$ -\nabla^2 u = f, $$

with boundary conditions that vanish and with a source term

$$ f(x, y) = 1 \quad \text{if } |x| < 0.5 \text{ and } |y| < 0.5 \quad \text{(and } f = 0 \text{ otherwise).} $$

A visualization of the source field and the exact solution is shown in Figure 3. Note the difference in scales.

Figure 3: (left) Source term for the 2D Poisson problem. (right) The solution of the 2D Poisson problem.

N = 40
A, b = generate_matrices_2d(N)

The code above will generate the system matrix A, with one row and column for each of the (N + 1) x (N + 1) nodes in the grid.
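Before iterating, it is instructive to check how large and how sparse this system actually is. The lines below are our own quick check and assume generate_matrices_2d returns A as a dense NumPy array:

print(A.shape)                              # dimensions of the linear system
nnz = np.count_nonzero(A)
print(nnz, "nonzeros out of", A.size)       # a small ratio means A is very sparse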

2.2 Jacobi method

Exercise: Run the Jacobi algorithm you implemented in Problem 1 on this problem. Set maxit = 20. You will observe that even after 20 iterations the method is far from convergence. Again, pay close attention to the scales. The convergence rate is very slow because the spectral radius in this case is 0.997. To visualize the solution after 20 iterations, as shown in Figure 4, call the visualize_solution_2d function.

visualize_solution_2d(x_jacobi)

Figure 4: Solution of the 2D Poisson problem after 20 steps of the Jacobi method.

2.3 Gauss-Seidel method

The next algorithm we will implement is Gauss-Seidel, another example of a stationary iteration. The idea is similar to Jacobi, but here we consider a different splitting of the matrix A. The matrix A can be split into A = L + U, where L is the lower triangular part of A (including the diagonal) and U is the strictly upper triangular part of A (without the diagonal):

$$ L = \begin{pmatrix} a_{11} & 0 & \cdots & 0 \\ a_{21} & a_{22} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}, \qquad U = \begin{pmatrix} 0 & a_{12} & \cdots & a_{1n} \\ 0 & 0 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 \end{pmatrix}. $$

The Gauss-Seidel algorithm then approaches the solution of the linear system iteratively via

$$ x^{k+1} = L^{-1}(b - U x^k) \qquad \text{or, componentwise,} \qquad x_i^{k+1} = \frac{1}{a_{ii}} \Big( b_i - \sum_{j=1}^{i-1} a_{ij} x_j^{k+1} - \sum_{j=i+1}^{n} a_{ij} x_j^k \Big). $$

Note that ρ_GS(L^{-1}U) < ρ_Jacobi(D^{-1}R) for the class of matrices for which these algorithms converge. See the convergence plot in Figure 6. A sketch of how to form this splitting in NumPy is given below.
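A minimal sketch of the splitting in NumPy (np.tril and np.triu are the relevant helpers; writing the Gauss-Seidel loop itself is the exercise that follows):

L = np.tril(A)                  # lower triangular part of A, including the diagonal
U = np.triu(A, k=1)             # strictly upper triangular part, diagonal excluded
assert np.allclose(A, L + U)    # sanity check: the splitting reproduces A
# One Gauss-Seidel step then has the same shape as a Jacobi step:
#   x <- np.linalg.solve(L, b - np.dot(U, x))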

Exercise: Program the Gauss-Seidel method, run it for 20 iterations, and visualize the solution. Hint: to extract the lower triangular part of the matrix, use np.tril(A). You should get something that matches the solution shown in Figure 5.

Exercise: Gauss-Seidel has more favorable convergence properties than Jacobi, but it also has the additional advantage that it requires less storage. Why does Gauss-Seidel require less storage than Jacobi?

Figure 5: Solution of the 2D Poisson problem after 20 steps of the Gauss-Seidel method.

Figure 6: (left) Plot of the residual norm vs. the iteration count. (right) Semilog plot of the residual norm vs. the iteration count.

Exercise: The Jacobi algorithm has recently gained popularity, even with its slow convergence, because it is very easy to parallelize. Why?

2.4 Conjugate Gradient

Another class of iterative algorithms used to solve linear systems are Krylov algorithms. Krylov subspace methods are considered one of the ten most important classes of numerical methods (to read about their history and about the other nine algorithms, see https://www.siam.org/pdf/news/637.pdf). Krylov subspace methods are based on successive multiplications of the matrix A with the right-hand-side vector b; in fact, this allows A to be given as an operator, which eliminates the need to store every entry of the matrix. Additionally, since A is sparse, these multiplications with A are generally cheap.

Conjugate Gradient (CG) is one of the most widely used methods for solving sparse linear systems with A a symmetric positive definite matrix. A basic implementation is given below; since CG relies on A being symmetric positive definite, it is worth verifying that property first (see the check that follows).
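A quick way to sanity-check the symmetric positive definite requirement on the 2D Poisson matrix (our own check; computing all eigenvalues this way is affordable only for small test problems like this one):

symmetric = np.allclose(A, A.T)
# eigvalsh assumes a symmetric matrix and returns its (real) eigenvalues
spd = symmetric and bool(np.all(np.linalg.eigvalsh(A) > 0))
print("symmetric:", symmetric, "positive definite:", spd)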

x_cg = np.zeros(np.shape(x))
rk = b - np.dot(A, x_cg)
p = rk
for k in np.arange(maxit):
    alpha = np.dot(rk, rk) / np.dot(p, np.dot(A, p))
    x_cg = x_cg + alpha * p
    rkp1 = rk - alpha * np.dot(A, p)
    beta = np.dot(rkp1, rkp1) / np.dot(rk, rk)
    p = rkp1 + beta * p
    rk = rkp1

Figure 7: Solution of the 2D Poisson problem after 20 steps of the Conjugate Gradient algorithm.

In Figure 7 we can see that even with only 20 iterations the computed solution matches the exact solution very well. The convergence plot in Figure 8 shows how the residual norm behaves for Conjugate Gradient.

Exercise: Note that the residual norm does not monotonically decrease for CG. Why? What is the norm that Conjugate Gradient is minimizing?

Figure 8: (left) Plot of the residual norm vs. the iteration count. (right) Semilog plot of the residual norm vs. the iteration count.
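The listing above does not record the residual norms, so to reproduce convergence plots like Figure 8 you need to track them yourself. A small variant for doing so (our own rearrangement of the code above, assuming A, b, x and maxit are defined as before and b is a one-dimensional array):

import matplotlib.pyplot as plt

nm_cg = np.zeros(maxit)
x_cg = np.zeros(np.shape(x))
rk = b - np.dot(A, x_cg)
p = rk
for k in np.arange(maxit):
    alpha = np.dot(rk, rk) / np.dot(p, np.dot(A, p))
    x_cg = x_cg + alpha * p
    rkp1 = rk - alpha * np.dot(A, p)
    beta = np.dot(rkp1, rkp1) / np.dot(rk, rk)
    p = rkp1 + beta * p
    rk = rkp1
    nm_cg[k] = np.linalg.norm(rk)   # rk equals b - A x_cg, so this is the residual norm

plt.semilogy(nm_cg)                 # compare with the right panel of Figure 8
plt.show()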

2.5 Preconditioned Conjugate Gradient (PCG) (Optional)

In many cases the linear systems are very ill-conditioned, and CG may be slow to converge or may even fail to converge at all. In fact, the number of iterations required for the Conjugate Gradient algorithm to converge is proportional to the square root of the condition number of A. Preconditioning is often necessary to ensure convergence of CG. A standard approach is to multiply the matrix equation by the inverse of a matrix M, giving

$$ M^{-1} A x = M^{-1} b. $$

Note that this system is equivalent to the original system. To improve the convergence of CG, the preconditioner M needs to be chosen so that M^{-1}A is better conditioned than A. In the case of Conjugate Gradient, M also needs to be positive definite. The code below shows the Conjugate Gradient algorithm with Jacobi preconditioning (M = diag(A)).

D = np.diag(np.diag(A))
x_pcg = np.zeros(np.shape(x))
rk = b - np.dot(A, x_pcg)
zk = np.linalg.solve(D, rk)
p = zk
for k in np.arange(maxit):
    alpha = np.dot(rk, zk) / np.dot(p, np.dot(A, p))
    x_pcg = x_pcg + alpha * p
    rkp1 = rk - alpha * np.dot(A, p)
    zkp1 = np.linalg.solve(D, rkp1)
    beta = np.dot(zkp1, rkp1) / np.dot(zk, rk)
    p = zkp1 + beta * p
    rk = rkp1
    zk = zkp1

Consider the example where A is a matrix with entries a_ii = 0.2 + i along the diagonal, and 1 on the subdiagonal, the superdiagonal, and the 200th subdiagonal and superdiagonal (i.e. a_ij = 1 when |i - j| = 1 and when |i - j| = 200). We have provided a function that returns this matrix for you.

A, b = pcg_example()

Observe the behavior of the residual norm for CG and PCG on this matrix.

2.6 GMRES (Optional)

For non-symmetric matrices, one of the most popular iterative methods is the Generalized Minimal RESidual method (GMRES). Like CG, it is a Krylov solver. Recall the definition of the Krylov subspace K_k(A, b),

$$ \mathcal{K}_k(A, b) = \mathrm{span}\{ b, Ab, A^2 b, \dots, A^{k-1} b \}. $$

The idea behind GMRES is that at every iteration k it computes the solution estimate x_k that minimizes the Euclidean norm of the residual over a Krylov subspace of dimension k. To generate this subspace, the Arnoldi method is used. The Arnoldi iteration uses modified Gram-Schmidt to build the orthonormal vectors that span the Krylov subspace; it is summarized in Algorithm 1. Note that after k steps of the Arnoldi algorithm we have generated an upper Hessenberg matrix H and a matrix Q whose columns are the orthonormal vectors spanning the Krylov subspace. These matrices satisfy the relation

$$ A Q_k = Q_{k+1} \tilde{H}_k, $$

where Q_k contains the first k Arnoldi vectors and \tilde{H}_k is the (k+1) x k upper Hessenberg matrix of the coefficients h_ij.
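To make the definition of the Krylov subspace concrete, the matrix whose columns are b, Ab, ..., A^{k-1}b can be built directly (our own sketch; GMRES never forms this matrix explicitly, because its columns quickly become nearly linearly dependent, which is exactly why the Arnoldi orthonormalization below is used instead):

def krylov_matrix(A, b, k):
    """Return the n x k matrix with columns b, Ab, ..., A^(k-1) b."""
    v = np.ravel(b).astype(float)
    K = np.zeros((len(v), k))
    for j in range(k):
        K[:, j] = v
        v = np.dot(A, v)
    return K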

Algorithm 1 Arnoldi using Modified Gram-Schmidt
1: Given A and the right-hand side b.
2: Compute β := ||b||_2 and q_1 = b/β
3: for j = 1, ..., k do
4:    Compute v = A q_j
5:    for i = 1, ..., j do
6:       h_ij := q_i^T v
7:       Compute v := v - h_ij q_i
8:    end for
9:    h_{j+1,j} := ||v||_2. If h_{j+1,j} = 0, stop
10:   q_{j+1} = v / h_{j+1,j}
11: end for

Algorithm 2 GMRES
1: Given A and the right-hand side b.
2: Compute β := ||b||_2 and q_1 = b/β
3: for j = 1, ..., k do
4:    Perform j steps of the Arnoldi algorithm as described in Algorithm 1
5:    Find the y that minimizes ||\tilde{H}_j y - β e_1||_2, where e_1 is the first vector of the standard basis of R^{j+1} (i.e. e_1 = (1, 0, ..., 0)^T). Set x_j = Q_j y
6: end for

The GMRES algorithm is then described in Algorithm 2. In step 5, you can either solve the least-squares problem using a QR factorization or use the NumPy function numpy.linalg.lstsq(H_j, beta * e1).

Exercise: Where in the code are the Krylov vectors computed?
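For reference, here is a minimal NumPy sketch of Algorithm 1 (our own translation of the pseudocode; the GMRES loop of Algorithm 2 is left for you to assemble). It assumes b is a one-dimensional array:

def arnoldi(A, b, k):
    """k steps of Arnoldi with modified Gram-Schmidt.
    Returns Q with k+1 orthonormal columns and the (k+1) x k upper
    Hessenberg matrix H such that A Q[:, :k] = Q H."""
    n = len(b)
    Q = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    beta = np.linalg.norm(b)
    Q[:, 0] = b / beta
    for j in range(k):
        v = np.dot(A, Q[:, j])
        for i in range(j + 1):
            H[i, j] = np.dot(Q[:, i], v)
            v = v - H[i, j] * Q[:, i]
        H[j + 1, j] = np.linalg.norm(v)
        if H[j + 1, j] == 0:        # breakdown: the Krylov subspace is A-invariant
            break
        Q[:, j + 1] = v / H[j + 1, j]
    return Q, H

# You can verify the Arnoldi relation numerically, e.g. for k = 10:
#   Q, H = arnoldi(A, b, 10)
#   print(np.allclose(np.dot(A, Q[:, :10]), np.dot(Q, H)))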