Sparse Matrices and Iterative Methods
Department of Mathematics, 2018
Iterative Methods
Consider the problem of solving Ax = b, where A is n × n. Why would we use an iterative method? To avoid a direct decomposition (LU, QR, Cholesky) and replace it with repeated matrix-vector multiplication. LU costs O(n^3) flops, while a matrix-vector multiplication costs O(n^2), so if we can reach convergence in, say, O(log n) iterations, iteration can be faster.
Jacobi, Gauss-Seidel, SOR
Some older methods: Jacobi is easily parallelized, but converges extremely slowly. Gauss-Seidel and SOR converge faster, but cannot be effectively parallelized. Of these, only Jacobi really takes advantage of sparsity. A minimal sketch of the Jacobi iteration follows.
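To make the iteration concrete, here is a minimal dense-matrix Jacobi sketch (the function name, stopping rule, and iteration cap are our own illustrative choices, not from any library):

import numpy as np

def jacobi(A, b, tol=1e-8, maxiter=500):
    # Split A = D + R, with D the diagonal of A, and iterate
    #   x_{k+1} = D^{-1} (b - R x_k)
    D = np.diag(A)                     # diagonal entries of A
    R = A - np.diagflat(D)             # off-diagonal part
    x = np.zeros_like(b, dtype=float)
    for k in range(maxiter):
        x_new = (b - R @ x) / D
        if np.linalg.norm(x_new - x) < tol:
            return x_new, k            # converged after k+1 sweeps
        x = x_new
    return x, maxiter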
When a matrix is sparse (many more zero entries than nonzero), the number of nonzero entries is typically O(n), so matrix-vector multiplication becomes an O(n) operation. This makes iterative methods very attractive. Sparsity does not help direct solves as much, because of the problem of fill-in, though we note that there are specialized solvers and orderings that minimize fill-in. A sketch of the O(nnz) multiplication follows.
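To see why the cost drops to one operation per stored entry, here is an illustrative sketch that represents A as (row, column, value) triples (a toy stand-in for the storage formats discussed below, not a library routine):

def sparse_matvec(triples, x, n):
    # y = A @ x, with A given as (row, col, value) triples.
    # One multiply-add per nonzero: O(nnz) work, not O(n^2).
    y = [0.0] * n
    for i, j, v in triples:
        y[i] += v * x[j]
    return y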
Krylov Subspace Methods
A class of methods that converge in at most n iterations (in exact arithmetic). We hope that they arrive at a solution that is close enough in far fewer iterations. These often work much better than the classic methods. They are more readily parallelized, and take full advantage of sparsity, since the matrix enters only through matrix-vector products, as the sketch below shows.
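For illustration, here is a minimal textbook conjugate gradient sketch for symmetric positive definite A (scipy provides production implementations of this and other Krylov methods, used later in these notes):

import numpy as np

def conjugate_gradient(A, b, tol=1e-10, maxiter=None):
    # The matrix enters only through products A.dot(p), so A can be
    # a dense array, a sparse matrix, or any object with .dot().
    n = len(b)
    if maxiter is None:
        maxiter = n                  # exact-arithmetic bound
    x = np.zeros(n)
    r = b.astype(float)              # residual b - A x for x = 0
    p = r.copy()                     # search direction
    rs = r @ r
    for k in range(maxiter):
        Ap = A.dot(p)
        alpha = rs / (p @ Ap)        # step length
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p    # new A-conjugate direction
        rs = rs_new
    return x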
Possibilities
Sparse matrices are quite common in computation:
- Finite differences for PDEs
- Finite elements for PDEs
- Integral equations with localized kernels
Structures
[Figure: sparsity patterns, with nonzero elements shown in blue; nz denotes the number of nonzeros out of 17 million entries.]
Some formats
There are a few obvious ways to store sparse matrices:
- Diagonals: 1+ entries per nonzero
- Coordinates: 3 entries per nonzero
- Row- or column-oriented coordinates: 2+ entries per nonzero
Diagonal (DIA)
Matrix:
[ 1 0 0 0 2 0 ]
[ 3 4 0 0 0 5 ]
[ 0 6 7 0 0 0 ]
[ 0 0 8 9 0 0 ]
[ 1 0 0 2 3 0 ]
[ 0 4 0 0 5 6 ]
Sparse storage (one column per stored diagonal; row j holds the entries lying in column j of the matrix, with out-of-range positions padded by 0):
[ 1 3 1 0 ]
[ 4 6 4 0 ]
[ 0 8 7 0 ]
[ 0 2 9 0 ]
[ 0 5 3 2 ]
[ 0 0 6 5 ]
Offsets: [ -4 -1 0 4 ]
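The same matrix could be built with scipy as follows (scipy stores one row per diagonal, i.e. the transpose of the storage table above; this snippet is our own illustration):

import numpy as np
from scipy.sparse import dia_matrix

# One row of data per stored diagonal, indexed by matrix column;
# entries that fall outside the matrix are padding (zeros here).
data = np.array([[1, 4, 0, 0, 0, 0],    # offset -4
                 [3, 6, 8, 2, 5, 0],    # offset -1
                 [1, 4, 7, 9, 3, 6],    # offset  0
                 [0, 0, 0, 0, 2, 5]])   # offset +4
offsets = np.array([-4, -1, 0, 4])
A = dia_matrix((data, offsets), shape=(6, 6))
print(A.toarray())   # recovers the dense matrix above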
Coordinate (COO)
Matrix:
[ 1 0 0 0 2 0 ]
[ 3 4 0 0 0 5 ]
[ 0 6 7 0 0 0 ]
[ 0 0 8 9 0 0 ]
[ 1 0 0 2 3 0 ]
[ 0 4 0 0 5 6 ]
Sparse storage:
rows: [ 0 0 1 1 1 ... 5 5 5 ]
cols: [ 0 4 0 1 5 ... 1 4 5 ]
vals: [ 1 2 3 4 5 ... 4 5 6 ]
Often used for conversions between formats.
Compressed Sparse Row (CSR)
Matrix:
[ 1 0 0 0 2 0 ]
[ 3 4 0 0 0 5 ]
[ 0 6 7 0 0 0 ]
[ 0 0 8 9 0 0 ]
[ 1 0 0 2 3 0 ]
[ 0 4 0 0 5 6 ]
Sparse storage:
cols:    [ 0 4 0 1 5 ... 1 4 5 ]
vals:    [ 1 2 3 4 5 ... 4 5 6 ]
offsets: [ 0 2 5 7 9 12 15 ]
Row i occupies positions offsets[i] through offsets[i+1]-1 of cols and vals. There is also a Compressed Sparse Column (CSC) format, used for multiplications.
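Tying the formats together, this sketch builds the example matrix from its COO triples (read off the matrix above) and converts it to CSR so the arrays can be inspected:

import numpy as np
from scipy.sparse import coo_matrix

# (row, col, value) triples read off the 6x6 example matrix
rows = np.array([0, 0, 1, 1, 1, 2, 2, 3, 3, 4, 4, 4, 5, 5, 5])
cols = np.array([0, 4, 0, 1, 5, 1, 2, 2, 3, 0, 3, 4, 1, 4, 5])
vals = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5, 6])
A = coo_matrix((vals, (rows, cols)), shape=(6, 6))

Acsr = A.tocsr()
print(Acsr.indptr)    # row offsets: [ 0  2  5  7  9 12 15]
print(Acsr.indices)   # column indices
print(Acsr.data)      # values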
Sparse Package
SciPy has a subpackage called sparse that implements many of these formats:
- Diagonal: dia_matrix()
- Coordinate: coo_matrix()
- CSR, CSC: csr_matrix(), csc_matrix()
... and others.
Using Sparse

from numpy import array
from scipy.sparse import csr_matrix

A = csr_matrix([[-1, 1, 0, 0],
                [ 0,-2, 0, 0],
                [ 0,-3, 0, 5],
                [ 0, 0, 1, 1]])
x = array([1, 0, -1, 0])
y = A.dot(x)
print(y)   # [-1  0  0 -1]
print(A)   # one "(row, col)  value" entry per line

Results in y = [-1, 0, 0, -1]; A prints in COO-style (row, col) value format. Note that numpy's dot(A, x) does not work on a sparse matrix; dot must be the method of the sparse object (A @ x also works).
Example
As a very simple example of the efficacy of the sparse matrix package in scipy, consider the Poisson problem
-∆u = 1 in Ω,  u = 0 on ∂Ω,
where the region Ω is the unit square. We solve this numerically using finite differences.
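Discretizing with the standard 5-point stencil on an N × N grid of interior points with spacing h = 1/(N+1), each interior unknown u_{i,j} satisfies

4 u_{i,j} - u_{i-1,j} - u_{i+1,j} - u_{i,j-1} - u_{i,j+1} = h^2,

which is why the matrix assembled on the next slide has 4 on the diagonal, -1 on the offsets ±1 and ±N, and a right-hand side of h^2 times the all-ones vector.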
Matrix
There are many ways to assemble the matrix. Here is one:

from numpy import ones
from scipy.sparse import dia_matrix

N = 100                      # interior points per dimension
nsq = N*N                    # total number of unknowns
h = 1.0/float(N+1)           # grid spacing
offsets = [-N, -1, 0, 1, N]
subdiag1 = ones(nsq)
subdiag1[N-1:nsq:N] = 0.     # kill couplings that would wrap between grid rows
supdiag1 = ones(nsq)
supdiag1[0:nsq:N] = 0.       # same for the superdiagonal
A = dia_matrix(([-ones(nsq), -subdiag1, 4.*ones(nsq),
                 -supdiag1, -ones(nsq)], offsets),
               shape=(nsq, nsq))
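As a quick sanity check (our own, not part of the original code), re-running the assembly above with N = 3 and printing the result dense shows the familiar block-tridiagonal 5-point pattern:

# With N = 3, A is 9x9 and should look like
#   [ T -I    ]            [ 4 -1  0 ]
#   [-I  T -I ]   with T = [-1  4 -1 ]  and I the 3x3 identity.
#   [   -I  T ]            [ 0 -1  4 ]
print(A.toarray())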
Conversion
It is easy to convert to other formats: e.g. given our DIA-format matrix A, we can get other formats using:

Acsr = A.tocsr()
Afull = A.toarray()
Solve
We can solve the system using various methods. Given:

from scipy.linalg import solve as lsolve
import scipy.sparse.linalg as sp

Sparse conjugate gradient (cg returns the solution together with a convergence flag):

soln, info = sp.cg(A, h*h*ones(nsq))

Sparse LU:

solnsp = sp.spsolve(Acsr, h*h*ones(nsq))

Full LU:

solnfull = lsolve(Afull, h*h*ones(nsq))
Results
... for a 100 × 100 grid of interior finite-difference points (i.e. a 10000 × 10000 matrix):
Sparse CG: 0.037 seconds, 2.7e-7 difference from the full solution
Sparse LU: 0.056 seconds, 1.9e-13 difference from the full solution
Full LU: 15.2 seconds, 0 difference (the reference)
Of course, for a more serious problem we would use a preconditioner, etc.; one possible sketch follows.
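As one illustration of preconditioning (our own sketch, using scipy's incomplete-LU factorization; for a symmetric positive definite system an incomplete Cholesky factorization would be more natural, but scipy does not provide one):

import scipy.sparse.linalg as sp
from numpy import ones

# Use an incomplete LU factorization of A as the preconditioner,
# applied through its triangular solves.
ilu = sp.spilu(Acsr.tocsc())     # spilu expects CSC format
M = sp.LinearOperator(Acsr.shape, matvec=ilu.solve)
soln, info = sp.cg(Acsr, h*h*ones(nsq), M=M)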