Iterative Methods for Sparse Linear Systems
Matrix Computations and Applications, Lecture C11
Fredrik Bengzon, Robert Söderlund
Department of Mathematics, Umeå University
November 11, 2008

Sparse Linear Systems

We consider the problem of solving the linear system of equations

    Ax = b

where A is a given n × n matrix, b is a given n × 1 vector, and x is the sought n × 1 solution vector. Our basic assumption is that n is large (e.g., n > 10^3) and that A is sparse. A sparse matrix is somewhat vaguely defined as one with very few non-zero entries a_{i,j}.

Motivation for Studying Sparse Linear Systems

Discretization of partial differential equations (PDEs) constitutes by far the biggest source of sparse matrix problems. PDEs are important since they provide mathematical models of many real-world phenomena, such as:

Elastic deformation - the Cauchy-Navier equations

    -∇·σ = f,   σ = 2με(u) + λ(∇·u)I

Fluid flow - the Navier-Stokes equations

    u̇ + (u·∇)u - Δu + ∇p = 0,   ∇·u = 0

Heat transfer - the heat equation

    Ṫ + u·∇T - ΔT = f
Discretizing Partial Differential Equations

The typical way to solve PDEs is to discretize them, that is, to approximate them by other (algebraic) equations involving only a finite number of unknowns. The matrix problems arising from these discretizations are generally large and sparse. Two standard techniques for discretizing PDEs are:

Finite Difference Methods (FDM)
Finite Element Methods (FEM)

A PDE Model Problem: Poisson's Equation

Consider Poisson's problem: Find a function u = u(x, y) such that

    -(∂^2 u/∂x^2 + ∂^2 u/∂y^2) = 1                                  (1)

within the unit square Ω = {0 < x, y < 1} and such that u = 0 on the boundary ∂Ω. Needless to say, this is a PDE since it is an equation involving the partial derivatives of u. Let us discretize this PDE using finite differences.

Finite Differences

To obtain a numerical approximation to u the domain Ω is first subdivided into an (N+1) × (N+1) grid of squares. The grid spacing is h = 1/(N+1). The vertex points (ih, jh), i, j = 0, 1, ..., N+1, are called nodes. The aim is to compute approximate values of u at the nodes.

Finite Differences, cont'd

Let u(ih, jh) = u_{i,j} for ease of notation. On the grid, derivatives are approximated by difference quotients, e.g.,

    ∂u_{i+1/2,j}/∂x ≈ (u_{i+1,j} - u_{i,j})/h

Thus

    ∂^2 u_{i,j}/∂x^2 ≈ (u_{i+1,j} - 2u_{i,j} + u_{i-1,j})/h^2

and hence

    ∂^2 u_{i,j}/∂x^2 + ∂^2 u_{i,j}/∂y^2 ≈ (u_{i-1,j} + u_{i+1,j} - 4u_{i,j} + u_{i,j-1} + u_{i,j+1})/h^2

(The expression on the right hand side is sometimes called the 5-point stencil.)
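To see the difference quotient at work, here is a small MATLAB check (our addition, not part of the original slides): it applies the second-difference quotient to u = sin(πx), whose exact second derivative is -π^2 sin(πx).

    % Sketch: verify the second-difference quotient on u = sin(pi*x)
    h = 0.01; x = 0.3;
    d2 = (sin(pi*(x+h)) - 2*sin(pi*x) + sin(pi*(x-h)))/h^2
    exact = -pi^2*sin(pi*x)   % d2 agrees with this up to an O(h^2) error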
Finite Differences, cont'd

A finite difference approximation to (1) is defined by

    4u_{i,j} - u_{i-1,j} - u_{i+1,j} - u_{i,j-1} - u_{i,j+1} = h^2          (2)

with i, j = 1, 2, ..., N. This is N^2 linear equations for the N^2 unknown values u_{i,j} at the interior nodes. (Recall that u = 0 on the boundary.)

Finite Differences, cont'd

The resulting linear system Ax = b is sparse, banded, and generally very large, since we wish to obtain an accurate solution approximation to the PDE and therefore compute on a fine grid with many nodes. In MATLAB it is easy to construct A since it is given by the built-in routine delsq. Just type

    N = 5; h = 1/(N+1);
    G = numgrid('S',N+2);    % create grid
    A = delsq(G);            % matrix A
    b = h^2*ones(N^2,1);     % vector b

Finite Differences, cont'd

The distribution of the non-zero entries in a matrix is called the sparsity pattern of the matrix. You can plot the sparsity pattern of a matrix in MATLAB with the spy command.

Figure: Sparsity pattern of the finite difference matrix A.

Finite Differences, cont'd

Solving the linear system resulting from the finite difference discretization process we get a solution approximation to the PDE.

    x = A\b;                   % solve linear system
    u = G;                     % map solution onto grid
    u(G>0) = full(x(G(G>0)));
    surf(u)                    % plot

Figure: Plotted finite difference solution.
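As a cross-check (our illustration, not part of the lecture), the same N^2 × N^2 five-point matrix can also be assembled by hand from Kronecker products of a one-dimensional second-difference matrix:

    % Sketch: build the 2D five-point Laplacian from 1D pieces
    N = 5;
    e = ones(N,1);
    T = spdiags([-e 2*e -e], -1:1, N, N);  % 1D matrix tridiag(-1, 2, -1)
    I = speye(N);
    A2 = kron(I,T) + kron(T,I);            % 2D five-point Laplacian
    % A2 should coincide with delsq(numgrid('S',N+2)) up to node ordering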
Finite Differences, cont'd

The general idea is that the accuracy of the computed finite difference solution can be improved by using a finer grid with more closely spaced nodes. This implies more computational work since the size of the resulting linear system grows. When doing computations one usually has to strike a balance between the precision of the output and the computational cost.

Finite differences are popular because of their simplicity. Their drawbacks are that they cannot handle complex computational domains, and are hard to analyze quantitatively. Finite element methods are easier to analyze and are also able to handle complex geometries, but they are more complicated. Both methods, however, yield large linear systems to be solved, depending on the size of the mesh.

The Need for Iterative Methods for Linear Systems

Direct methods for solving Ax = b (such as LU factorization, or Gaussian elimination) require roughly n^3 arithmetic operations (i.e., + and *). Thus they often become too expensive (with respect to CPU time and memory consumption) to use when dealing with very large linear systems. In such cases iterative methods might work fine, since they are usually memory conserving and fast. Iterative algorithms produce a sequence of approximations x^(1), x^(2), ..., x^(k) which (hopefully) converges to the true solution x as k → ∞.

Basic Iterative Methods for Linear Systems

Consider the system of equations Ax = b. Let us split A into

    A = M - K

where M is any non-singular matrix and K = M - A. Hence, Ax = b becomes

    (M - K)x = b
    Mx = Kx + b
    x = M^{-1}Kx + M^{-1}b

This suggests the iteration scheme: For k = 1, 2, 3, ... repeat until convergence

    x^(k+1) = M^{-1}Kx^(k) + M^{-1}b

Basic Iterative Methods for Linear Systems, cont'd

Of course, for this iteration to be computationally practical, the splitting of A should be chosen such that M^{-1}Kx and M^{-1}b are easy to calculate. We will study splittings based on the diagonal, and the upper/lower triangular parts of A:

    A = D - L - U
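Whatever splitting is chosen, the iteration fits one code template. A minimal MATLAB sketch follows, with a residual-based stopping test added; the names x0, tol, and maxit are our assumptions, not from the slides, and A, b, M, K are taken as given.

    % Sketch: generic splitting iteration with a relative-residual test
    x = x0;
    for k = 1:maxit
      x = M\(K*x + b);                 % one step x = M^{-1}(Kx + b)
      if norm(b - A*x) <= tol*norm(b)  % stop when the residual is small
        break
      end
    end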
Jacobi's Method

So-called Jacobi iteration is defined by choosing the splitting

    A = D - (L + U) = M - K

where D is the diagonal of A, -L is the strictly lower triangle of A, and -U is the strictly upper triangle of A. The iteration scheme takes the form

    x^(k+1) = D^{-1}(L + U)x^(k) + D^{-1}b

Note that D is easy to invert since it is a diagonal matrix.

The Gauss-Seidel Method

In the (forward) Gauss-Seidel method A is split into

    A = (D - L) - U = M - K

yielding the iteration scheme

    x^(k+1) = (D - L)^{-1}(Ux^(k) + b)

Since D - L is lower triangular, the effect of (D - L)^{-1} can be computed by forward substitution. The (backward) Gauss-Seidel method instead uses

    A = (D - U) - L = M - K

Successive Over-Relaxation (SOR)

A more sophisticated method is obtained by choosing

    A = ((1/ω)D - L) - (((1-ω)/ω)D + U) = M - K

where ω is a relaxation parameter. This gives the iteration scheme

    x^(k+1) = (D - ωL)^{-1}(ωU + (1-ω)D)x^(k) + ω(D - ωL)^{-1}b

Matlab Code

Set up the splitting:

    D = diag(diag(A));
    L = -tril(A,-1);
    U = -triu(A,1);

Jacobi's method:

    for k = 1:10
      y = D\((L+U)*x + b);
      x = y;
    end

Gauss-Seidel's method:

    for k = 1:10
      y = (D-L)\(U*x + b);
      x = y;
    end
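The slides give no code for SOR, but a loop in the same style follows directly from the iteration scheme above (our sketch; the value ω = 1.5 is just an example):

    w = 1.5;   % relaxation parameter, must satisfy 0 < w < 2
    for k = 1:10
      y = (D - w*L)\((w*U + (1-w)*D)*x + w*b);
      x = y;
    end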
Diagonally Dominant Matrices

An n × n matrix A is said to be (strictly) diagonally dominant if

    |a_{i,i}| > Σ_{j=1, j≠i}^n |a_{i,j}|,   i = 1, 2, ..., n

that is, if the absolute value of each diagonal element is greater than the sum of the absolute values of the other elements in its row.

Example: the matrix

    A = [ 4  1  0
          2  5  1
          6  0  7 ]

is diagonally dominant.

Convergence

The success (i.e., convergence) of an iterative method depends on the type of linear system Ax = b it is applied to.

Jacobi converges if A is strictly diagonally dominant.
Gauss-Seidel converges if A is symmetric and positive definite (SPD).
SOR also converges for SPD matrices, but only if the relaxation parameter ω is such that 0 < ω < 2.

Kaczmarz Method

So what if the matrix A is non-singular, but unsymmetric or indefinite? In these cases it is possible to apply Gauss-Seidel or SOR to

    A^T Ax = A^T b

that is, the familiar normal equations. Since A^T A is symmetric and positive definite if A is non-singular, both Gauss-Seidel and SOR will converge. However, the rate of convergence can be slow.

Convergence Analysis

All the above methods can be written

    x^(k+1) = Rx^(k) + c

where the iteration matrix is R = M^{-1}K and c = M^{-1}b. A relation between the errors in successive approximations can be derived by subtracting the fixed-point identity x = Rx + c:

    x^(k+1) - x = R(x^(k) - x) = ... = R^{k+1}(x^(0) - x)

Taking norms and using the submultiplicativity of the matrix norm we get

    ||x^(k+1) - x|| ≤ ||R^{k+1}|| ||x^(0) - x|| ≤ ||R||^{k+1} ||x^(0) - x||

Thus a sufficient condition for convergence is that ||R|| < 1 in some norm.
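This condition can be checked numerically. As a sketch (our addition), the Jacobi iteration matrix of the diagonally dominant example above indeed has norm below one:

    A = [4 1 0; 2 5 1; 6 0 7];
    D = diag(diag(A)); L = -tril(A,-1); U = -triu(A,1);
    R = D\(L + U);       % Jacobi iteration matrix
    norm(R,inf)          % = 6/7 < 1, so Jacobi converges
    max(abs(eig(R)))     % spectral radius, an even sharper indicator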
Convergence Analysis, cont'd

From the above analysis it is clear that ||R|| should be as small as possible, since this is the amplification factor for the error in each iteration. Hence the splitting of A should be chosen such that:

Rx = M^{-1}Kx and c = M^{-1}b are easy to evaluate.
||R|| is small.

Unfortunately, these goals are contradictory, and a balance has to be struck. For example:

1. M = I makes M^{-1} trivial, but then R = I - A, and probably not ||I - A|| < 1.
2. M = A gives K = 0 and thus R = M^{-1}K = 0, but then M^{-1} = A^{-1} is expensive to compute.

Convergence Analysis for Jacobi's Method

We claimed before that Jacobi's method converges for strictly diagonally dominant matrices. Let us prove this. In Jacobi's method the iteration matrix R = D^{-1}(L + U) has the elements

    r_{i,j} = -a_{i,j}/a_{i,i},  i ≠ j,   r_{i,i} = 0

Taking the infinity norm gives

    ||R||_∞ = max_{1≤i≤n} Σ_{j=1, j≠i}^n |a_{i,j}|/|a_{i,i}|

which shows that ||R||_∞ < 1 if A is strictly diagonally dominant, and we are done.

Storage Schemes for Sparse Matrices

In order to take advantage of the large number of zero entries, special schemes are used to store sparse matrices. The main goal is to represent only the non-zero elements, and to be able to perform the common matrix operations, e.g., matrix-vector multiplication.

Coordinate Format

The simplest sparse format is the coordinate format. The data structure consists of three arrays: one for the non-zero matrix values, one for the row indices, and one for the column indices. All arrays are of length nz, the number of non-zeros. The matrix

    A = [ 1  0  0  2  0
          3  4  0  5  0
          6  0  7  8  9
          0  0 10 11  0
          0  0  0  0 12 ]

can (for example) be represented as

    a = [1 2 3 4 5 6 7 8 9 10 11 12];
    r = [1 1 2 2 2 3 3 3 3 4 4 5];
    c = [1 4 1 2 4 1 3 4 5 3 4 5];
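In MATLAB the coordinate format maps directly onto the built-in sparse constructor, so the example can be reproduced as follows (our illustration):

    A = sparse(r,c,a,5,5);   % build the 5-by-5 matrix from the triplets
    full(A)                  % recover the dense matrix for inspection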
Compressed Sparse Row Format

Assuming that n < nz, a more economic storage scheme is the compressed sparse row (CSR) format. It also contains three arrays:

A real array (values) of length nz containing all the non-zero matrix entries a_{i,j}, stored row by row.
An integer array (colind) of length nz containing the column indices of the entries a_{i,j}.
An integer array (rowptr) of length n + 1 containing pointers to the start of each row within the arrays values and colind. The last position usually contains the number of non-zeros nz, or nz + 1.

Compressed Sparse Row Format, cont'd

The matrix

    A = [ 1  0  0  2  0
          3  4  0  5  0
          6  0  7  8  9
          0  0 10 11  0
          0  0  0  0 12 ]

is stored in CSR format as

    values = [1 2 3 4 5 6 7 8 9 10 11 12];
    colind = [1 4 1 2 4 1 3 4 5 3 4 5];
    rowptr = [1 3 6 10 12 13];

Compressed Sparse Column Format

There are a number of variations on the CSR format. The most obvious is to store the columns instead of the rows. This is the compressed sparse column (CSC) scheme. The CSC data arrays are matrix values, row indices, and column pointers.

Basic Sparse Matrix Operations

The basic operations of linear algebra are more complicated when the matrix is stored in CSR format. Consider, for example, matrix-vector multiplication y = Ax:

    y = zeros(n,1);
    for i = 1:n
      for j = rowptr(i):rowptr(i+1)-1
        y(i) = y(i) + values(j)*x(colind(j));
      end
    end

In the inner loop we traverse the non-zero entries of A in row i, multiply them with the corresponding entries of x, and add to y(i).
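As a quick self-contained validation (our addition), the CSR loop can be run on the example matrix above and compared against MATLAB's built-in dense product:

    n = 5; x = (1:n)';
    values = [1 2 3 4 5 6 7 8 9 10 11 12];
    colind = [1 4 1 2 4 1 3 4 5 3 4 5];
    rowptr = [1 3 6 10 12 13];
    y = zeros(n,1);
    for i = 1:n
      for j = rowptr(i):rowptr(i+1)-1
        y(i) = y(i) + values(j)*x(colind(j));   % CSR matrix-vector product
      end
    end
    A = [1 0 0 2 0; 3 4 0 5 0; 6 0 7 8 9; 0 0 10 11 0; 0 0 0 0 12];
    norm(y - A*x)   % should be exactly 0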
Basic Sparse Matrix Operations, cont'd

Since this method only multiplies non-zero matrix entries, the operations count is O(nz), which is a substantial saving over the dense operation requirement O(n^2). As we shall see, the basic matrix-vector operation Ax is very important for modern iterative methods, since it is practically the only operation necessary (and affordable).

Reading

Y. Saad, Iterative Methods for Sparse Linear Systems, 1st edition. The book is free and can be downloaded from the URL:

    http://www-users.cs.umn.edu/~saad/books.html

Chapter 2: 2.1, 2.2, and 2.3
Chapter 3: 3.1, 3.4, and 3.5
Chapter 4: read as much as you like