AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)

Lecture 19: Computing the SVD; Sparse Linear Systems
Xiangmin Jiao, Stony Brook University

Outline
1. Computing the SVD (NLA 31)
2. Sparse Storage Format
3. Direct Methods for Sparse Linear Systems (MC 11.1-11.2)
4. Overview of Iterative Methods for Sparse Linear Systems

SVD of A and Eigenvalues of A^*A
Intuitive idea for computing the SVD of A ∈ R^{m×n}: form A^*A and compute its eigenvalue decomposition A^*A = V Λ V^*.
Let Σ = Λ^{1/2}, i.e., Σ = diag(√λ_1, √λ_2, ..., √λ_n), and solve the system UΣ = AV to obtain U.
This method is efficient if m ≫ n. However, it may not be stable, especially for the smaller singular values, because of the squaring of the condition number:
For the SVD of A, |σ̃_k − σ_k| = O(ɛ_machine ‖A‖), where σ̃_k and σ_k denote the computed and exact kth singular values.
If computed from the eigenvalue decomposition of A^*A, |σ̃_k − σ_k| = O(ɛ_machine ‖A‖^2 / σ_k), which is problematic if σ_k ≪ ‖A‖.
If one is interested in only the relatively large singular values, then computing the eigenvalues of A^*A is not a problem. For general situations, a more stable algorithm is desired.
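
To make the squaring effect concrete, here is a minimal numerical sketch (mine, not from the slides) comparing singular values computed from the eigenvalues of A^T A against numpy.linalg.svd; the test matrix and its singular values are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Build a matrix whose smallest singular values sit near sqrt(eps_machine) * ||A||
U, _ = np.linalg.qr(rng.standard_normal((200, 6)))
V, _ = np.linalg.qr(rng.standard_normal((6, 6)))
sigma = np.array([1.0, 1e-2, 1e-4, 1e-6, 1e-8, 1e-10])
A = U @ np.diag(sigma) @ V.T

# Unstable route: eigenvalues of A^T A, then square roots
lam = np.linalg.eigvalsh(A.T @ A)[::-1]           # eigenvalues in descending order
sv_eig = np.sqrt(np.maximum(lam, 0.0))

# Stable route: a dedicated SVD
sv_svd = np.linalg.svd(A, compute_uv=False)

print("exact    :", sigma)
print("via A^T A:", sv_eig)   # the small singular values lose most or all of their digits
print("via SVD  :", sv_svd)
```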

A Different Reduction to an Eigenvalue Problem
Typical algorithms for computing the SVD are similar to algorithms for computing eigenvalues.
Consider A ∈ C^{m×n}. The Hermitian matrix
H = [ 0  A^* ; A  0 ]
has the eigenvalue decomposition
H [ V  V ; U  −U ] = [ V  V ; U  −U ] [ Σ  0 ; 0  −Σ ],
where A = UΣV^* gives the SVD. This approach is stable.
In practice, such a reduction is done implicitly, without forming the large matrix H.
It is typically done in two phases.
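
As a quick sanity check of this reduction (a sketch added here, not part of the slides; real square A for simplicity), one can form H explicitly for a small matrix and verify that its eigenvalues come in ± pairs equal to the singular values of A.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))

# Augmented symmetric matrix H = [[0, A^T], [A, 0]]
H = np.block([[np.zeros((n, n)), A.T],
              [A, np.zeros((n, n))]])

eig = np.sort(np.abs(np.linalg.eigvalsh(H)))       # |eigenvalues of H|, ascending
sv = np.sort(np.linalg.svd(A, compute_uv=False))   # singular values of A, ascending

# Each singular value sigma_i of A appears as a +sigma_i / -sigma_i eigenvalue pair of H
print(np.allclose(eig[0::2], sv), np.allclose(eig[1::2], sv))
```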

Two-Phase Method
In the first phase, reduce A to bidiagonal form by applying different orthogonal transformations on the left and the right; this involves O(mn^2) operations.
In the second phase, reduce the bidiagonal form to diagonal form using a variant of the QR algorithm or a divide-and-conquer algorithm; this involves O(n^2) operations for fixed precision.
We hereafter focus on the first phase.

Golub-Kahan Bidiagonalization
Apply Householder reflectors alternately on the left and right sides.
Work for Golub-Kahan bidiagonalization: ~ 4mn^2 − (4/3)n^3 flops.
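
The sketch below (my own illustrative code, not from the lecture) carries out Golub-Kahan bidiagonalization with explicit Householder reflectors for a real matrix with m ≥ n; it returns only the bidiagonal factor, whose singular values agree with those of A.

```python
import numpy as np

def house(x):
    """Householder vector v (unit norm) such that (I - 2 v v^T) x is a multiple of e_1."""
    v = np.array(x, dtype=float)
    v[0] += (np.sign(x[0]) if x[0] != 0 else 1.0) * np.linalg.norm(x)
    nrm = np.linalg.norm(v)
    return v / nrm if nrm > 0 else v

def golub_kahan_bidiag(A):
    """Reduce A (m >= n) to upper bidiagonal form by alternating left/right Householder reflectors."""
    B = np.array(A, dtype=float)
    m, n = B.shape
    for k in range(n):
        # Left reflector: zero out the entries below the diagonal in column k
        v = house(B[k:, k])
        B[k:, k:] -= 2.0 * np.outer(v, v @ B[k:, k:])
        if k < n - 2:
            # Right reflector: zero out the entries right of the superdiagonal in row k
            w = house(B[k, k + 1:])
            B[k:, k + 1:] -= 2.0 * np.outer(B[k:, k + 1:] @ w, w)
    return B

A = np.random.randn(8, 5)
B = golub_kahan_bidiag(A)
# Orthogonal transformations preserve singular values, so B and A have the same sigmas
print(np.allclose(np.linalg.svd(A, compute_uv=False),
                  np.linalg.svd(B, compute_uv=False)))
```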

Lawson-Hanson-Chan Bidiagonalization
Speed up the reduction by first performing a QR factorization of A and then bidiagonalizing the triangular factor.
Work for LHC bidiagonalization: ~ 2mn^2 + 2n^3 flops, which is advantageous if m > (5/3)n.

Three-Step Bidiagonalization
Hybrid approach: apply QR at a suitable time, to the submatrix with aspect ratio 5/3.
Work for three-step bidiagonalization: ~ 4mn^2 − (4/3)n^3 − (2/3)(m − n)^3 flops.

Comparison of Performance
[Figure: operation counts of one-step (Golub-Kahan), two-step (LHC), and three-step bidiagonalization.]

2. Sparse Storage Format

Sparse Linear Systems
Boundary value problems and implicit methods for time-dependent PDEs yield systems of linear algebraic equations to solve.
A matrix is sparse if it has relatively few nonzero entries.
Sparsity can be exploited to use far less than the O(n^2) storage and O(n^3) work required by the standard approach for solving a dense n × n system.

Storage Formats for Sparse Matrices
Sparse matrices are typically stored in special formats that store only the nonzero entries, along with indices identifying their locations in the matrix, such as:
compressed-row storage (CRS)
compressed-column storage (CCS)
block compressed-row storage (BCRS)
Banded matrices have their own special storage formats, such as compressed diagonal storage (CDS).
See the survey at http://netlib.org/linalg/html_templates/node90.html
Explicitly storing indices incurs additional storage overhead and makes arithmetic operations on the nonzeros less efficient, due to the indirect addressing needed to access operands, so these formats are beneficial only for sufficiently sparse matrices.
The storage format can have a big impact on the effectiveness of different versions of the same algorithm (with different orderings of the loops).
Besides direct methods, these storage formats are also important for implementing iterative and multigrid solvers.

Example of Compressed-Row Storage (CRS)
[Figure: a small matrix together with its CRS arrays of values, column indices, and row pointers.]
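
Since the example figure is not reproduced in this transcription, here is a small sketch (using scipy.sparse, my choice of library) that prints the three CRS/CSR arrays for a 4 × 4 matrix.

```python
import numpy as np
import scipy.sparse as sp

A = np.array([[10., 0., 0., -2.],
              [ 3., 9., 0.,  0.],
              [ 0., 7., 8.,  7.],
              [ 0., 0., 4.,  5.]])

S = sp.csr_matrix(A)   # compressed-row storage
print(S.data)          # nonzero values, row by row: [10. -2. 3. 9. 7. 8. 7. 4. 5.]
print(S.indices)       # column index of each stored value: [0 3 0 1 1 2 3 2 3]
print(S.indptr)        # row pointers: row i occupies data[indptr[i]:indptr[i+1]] -> [0 2 4 7 9]
```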

3. Direct Methods for Sparse Linear Systems (MC 11.1-11.2)

Banded Linear Systems
The cost of factorizing a banded linear system depends on the bandwidth.
For an SPD n × n matrix with semi-bandwidth s, the total flop count of Cholesky factorization is about ns^2.
For an n × n matrix with lower bandwidth p and upper bandwidth q:
in A = LU (LU without pivoting), the total flop count is about 2npq;
in PA = LU (LU with partial pivoting), the total flop count is about 2np(p + q).
Banded matrices have their own special storage formats, such as compressed diagonal storage (CDS).
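
As a small usage sketch (mine; assumes SciPy), a banded SPD system can be factored and solved with scipy.linalg.solveh_banded, which exploits exactly the ns^2-type cost discussed above.

```python
import numpy as np
from scipy.linalg import solveh_banded

# SPD tridiagonal system (1D Poisson model problem): semi-bandwidth s = 1
n = 6
A = np.diag(2.0 * np.ones(n)) + np.diag(-np.ones(n - 1), 1) + np.diag(-np.ones(n - 1), -1)
b = np.ones(n)

# Upper banded storage for solveh_banded: row 0 holds the superdiagonal, row 1 the main diagonal
ab = np.zeros((2, n))
ab[0, 1:] = -1.0
ab[1, :] = 2.0

x = solveh_banded(ab, b)       # banded Cholesky factorization and solve
print(np.allclose(A @ x, b))   # True
```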

Fill
When applying LU or Cholesky factorization to a general sparse matrix, taking linear combinations of rows or columns to annihilate unwanted nonzero entries can introduce new nonzeros into matrix locations that were initially zero.
Such new nonzeros, called fill (or fill-in), must be stored and may themselves eventually need to be annihilated in order to obtain the triangular factors.
The resulting triangular factors can be expected to contain at least as many nonzeros as the original matrix, and usually significant fill as well.

Sparse Cholesky Factorization
In general, heuristic algorithms are employed to reorder the matrix so as to reduce fill.
The amount of fill is sensitive to the order in which the rows and columns of the matrix are processed, so the basic problem in sparse factorization is reordering the matrix to limit fill during factorization.
Exact minimization of fill is a hard combinatorial problem (NP-complete), but heuristic algorithms such as minimum degree and nested dissection limit fill well for many types of problems.
For Cholesky factorization, both rows and columns are reordered.

Graph Model of Elimination
Each step of the factorization process corresponds to the elimination of one node from a graph.
Eliminating a node causes its neighboring nodes to become connected to each other.
If any such neighbors were not already connected, then fill results (new edges in the graph and new nonzeros in the matrix).
Commonly used reordering methods include Cuthill-McKee, approximate minimum degree (AMD), and nested dissection.

Reordering to Reduce Bandwidth: the Cuthill-McKee and Reverse Cuthill-McKee Algorithms
The Cuthill-McKee algorithm is a variant of breadth-first search on the graph of the matrix:
it starts with a peripheral node;
it generates levels R_i for i = 1, 2, ... until all nodes are exhausted;
the set R_{i+1} is created from the set R_i by listing all vertices adjacent to the nodes in R_i;
within each level, nodes are listed in order of increasing degree.
The reverse Cuthill-McKee algorithm (RCM) reverses the resulting index numbering.
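
To illustrate the effect (a sketch of mine, using SciPy's reverse_cuthill_mckee rather than a hand-written RCM), scrambling the unknowns of a 2D Laplacian destroys its banded structure, and RCM recovers a small bandwidth:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee

# 5-point Laplacian on a 10x10 grid (a symmetric sparse test matrix)
n = 10
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsr()

def bandwidth(M):
    C = M.tocoo()
    return int(np.abs(C.row - C.col).max())

# Scramble the unknowns to simulate a poor initial ordering, then reorder with RCM
rng = np.random.default_rng(0)
p = rng.permutation(A.shape[0])
A_bad = A[p, :][:, p]
perm = reverse_cuthill_mckee(A_bad, symmetric_mode=True)
A_rcm = A_bad[perm, :][:, perm]

print(bandwidth(A_bad), bandwidth(A_rcm))   # RCM gives a much smaller bandwidth than the scrambled ordering
```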

Approximate Minimum Degree Ordering
A good heuristic for limiting fill is to eliminate first those nodes having the fewest neighbors.
The number of neighbors of a node is called its degree, so this heuristic is known as minimum degree.
At each step, select the node of smallest degree for elimination, breaking ties arbitrarily.
After a node has been eliminated, its neighbors become connected to each other, so the degrees of some nodes may change.
The process is then repeated, with a new node of minimum degree eliminated next, and so on until all nodes have been eliminated.

Minimum Degree Ordering, continued
The Cholesky factor suffers much less fill than with the original ordering, and the advantage grows with problem size.
Sophisticated versions of minimum degree are among the most effective general-purpose orderings known.

Comparison of Different Orderings of an Example Matrix
[Figure. Left: nonzero pattern of the matrix A. Right: nonzero pattern of the factor R.]

Nested Dissection Ordering
Nested dissection is based on divide-and-conquer:
First, a small set of nodes is selected whose removal splits the graph into two pieces of roughly equal size.
No node in either piece is connected to any node in the other, so no fill occurs in either piece due to the elimination of any node in the other.
The separator nodes are numbered last; the process is then repeated recursively on each remaining piece of the graph until all nodes have been numbered.

Nested Dissection Ordering, continued
The dissection induces blocks of zeros in the matrix that are automatically preserved during factorization.
The recursive nature of the algorithm can be seen in the hierarchical block structure of the matrix, which would involve many more levels in larger problems.
Again, the Cholesky factor suffers much less fill than with the original ordering, and the advantage grows with problem size.

Sparse Gaussian Elimination
For Gaussian elimination, only the columns are reordered.
Pivoting introduces additional fill in sparse Gaussian elimination.
Reordering may be done dynamically or statically.
The reverse Cuthill-McKee algorithm applied to A + A^T may be used to reduce the bandwidth.
Column approximate minimum degree (COLAMD) may be employed to reorder the matrix to reduce fill.
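
For a concrete look at how the column ordering affects fill, here is a sketch (mine, via SciPy's SuperLU interface) comparing the nonzeros in the LU factors under the natural ordering and under COLAMD; the test matrix is again a 2D Laplacian.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# 2D 5-point Laplacian, stored in CSC format as required by SuperLU
n = 30
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsc()
b = np.ones(A.shape[0])

lu_nat = spla.splu(A, permc_spec='NATURAL')   # no column reordering
lu_amd = spla.splu(A, permc_spec='COLAMD')    # column approximate minimum degree

print(A.nnz)                                  # nonzeros in A
print(lu_nat.L.nnz + lu_nat.U.nnz)            # fill with the natural ordering
print(lu_amd.L.nnz + lu_amd.U.nnz)            # typically noticeably less fill with COLAMD

x = lu_amd.solve(b)                           # sparse triangular solves with the computed factors
print(np.linalg.norm(A @ x - b))
```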

Comparison of Different Orderings of an Example Matrix
[Figures: nonzero pattern of A and of L + U with a random ordering, and with column AMD ordering.]

Comparison of Direct Methods
Computational cost for the Laplace equation on a k × k (× k) grid with n unknowns:

method             2-D             3-D
dense Cholesky     k^6 = n^3       k^9 = n^3
banded Cholesky    k^4 = n^2       k^7 = n^2.33
sparse Cholesky    k^3 = n^1.5     k^6 = n^2

Reference: Michael T. Heath, Scientific Computing: An Introductory Survey, 2nd Edition, McGraw-Hill, 2002.

Software for Sparse Solvers
Additional implementation complexities include cache performance and parallelism, so it is advisable to use existing software packages.
MATLAB has its own sparse solvers, used when the matrix is stored in sparse format:
a sparse matrix is created with the sparse function;
reordering is implemented in symrcm, symamd, and colamd.
For symmetric matrices, a good package is TAUCS.
For nonsymmetric matrices, a good package is SuperLU.

4. Overview of Iterative Methods for Sparse Linear Systems

Direct vs. Iterative Methods
Direct (noniterative) methods compute the exact solution after a finite number of steps (in exact arithmetic). Examples: Gaussian elimination, QR factorization.
Iterative methods produce a sequence of approximations x^(1), x^(2), ... that hopefully converge to the true solution. Examples: Jacobi, Conjugate Gradient (CG), GMRES, BiCG, etc.
Caution: the boundary between direct and iterative methods is sometimes blurry.
Why use iterative methods instead of direct methods? They
may be faster than direct methods,
produce useful intermediate results,
handle sparse matrices more easily (only matrix-vector products are needed),
and are often easier to implement on parallel computers.
Question: when should iterative methods not be used?

Two Classes of Iterative Methods
Stationary iterative methods are fixed-point iterations obtained from a matrix splitting.
Examples: Jacobi (for linear systems, not the Jacobi iteration for eigenvalues), Gauss-Seidel, Successive Over-Relaxation (SOR), etc.
Krylov subspace methods find an optimal solution in the Krylov subspace span{b, Ab, A^2 b, ..., A^k b}, building the subspace successively.
Examples: Conjugate Gradient (CG), Generalized Minimum Residual (GMRES), BiCG, etc.
We will focus on Krylov subspace methods.
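
As a small forward-looking sketch (mine, using scipy.sparse.linalg; CG itself is covered later), a Krylov method needs only matrix-vector products with the sparse matrix:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

# SPD model problem: 2D 5-point Laplacian on a 30x30 grid
n = 30
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(sp.identity(n), T) + sp.kron(T, sp.identity(n))).tocsr()
b = np.ones(A.shape[0])

x, info = cg(A, b)                       # Conjugate Gradient; info == 0 signals convergence
print(info, np.linalg.norm(A @ x - b))   # small residual
```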

Stationary Iterative Methods
Stationary iterative methods find a splitting A = M − N and iterate
x_{k+1} = M^{-1}(N x_k + b).
Let r_k = b − A x_k, so that x = x_k + A^{-1} r_k. A stationary iterative method approximates this by
x_{k+1} = x_k + M^{-1} r_k,
because
x_{k+1} = M^{-1} N x_k + M^{-1} b
        = M^{-1} N x_k + M^{-1}(r_k + A x_k)
        = M^{-1}(N + A) x_k + M^{-1} r_k
        = x_k + M^{-1} r_k.
A stationary iterative method is good if ρ(M^{-1} N) < 1 and M^{-1} is a good approximation to A^{-1}.

Stationary Iterative Methods, continued
Different choices of the splitting lead to different schemes.
Let A = L + D + U, where D is diagonal, L is strictly lower triangular, and U is strictly upper triangular.
Jacobi iteration: M = D; works well if A is diagonally dominant.
Gauss-Seidel: M = L + D; works well if A is SPD.
Successive Over-Relaxation (SOR): M = (1/ω)D + L, where 1 ≤ ω < 2; converges quickly with a proper choice of ω.
Symmetric SOR (SSOR): the symmetric version of SOR.
These methods work for some problems, but they may converge slowly.
Nevertheless, stationary methods are important as preconditioners for Krylov subspace methods and as smoothers in multigrid methods (covered later).
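
All of the splittings above fit the generic iteration x_{k+1} = x_k + M^{-1} r_k; the sketch below (my own, on an arbitrary small test matrix) is illustrative rather than efficient, since in practice one solves with M by exploiting its diagonal or triangular structure instead of calling a dense solver.

```python
import numpy as np

def stationary_solve(A, b, M, x0=None, tol=1e-8, maxiter=2000):
    """Generic stationary iteration x_{k+1} = x_k + M^{-1} r_k for a splitting A = M - N."""
    x = np.zeros_like(b) if x0 is None else np.array(x0, dtype=float)
    for k in range(maxiter):
        r = b - A @ x
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, k
        x = x + np.linalg.solve(M, r)   # in practice: exploit M's diagonal/triangular structure
    return x, maxiter

# Small diagonally dominant test problem
n = 50
A = 2.5 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)
D = np.diag(np.diag(A))
L = np.tril(A, -1)

x_j, it_j = stationary_solve(A, b, M=D)                    # Jacobi:       M = D
x_gs, it_gs = stationary_solve(A, b, M=D + L)              # Gauss-Seidel: M = D + L
omega = 1.2
x_sor, it_sor = stationary_solve(A, b, M=D / omega + L)    # SOR:          M = D/omega + L
print(it_j, it_gs, it_sor)   # Gauss-Seidel and SOR need noticeably fewer iterations than Jacobi here
```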

Stationary Iterative Methods: Example
For the 2D Poisson equation, the spectral radius of the Jacobi iteration matrix is cos(π/n) ≈ 1 − O(1/n^2), so the number of iterations required to reach a tolerance ɛ is O(n^2 ln ɛ^{-1}).
After 5 Jacobi iterations on a Poisson problem, the error decreases very slowly.
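
To put numbers on this estimate (a quick sketch of mine, applying the formula from the slide), the predicted Jacobi iteration counts grow roughly quadratically with n:

```python
import numpy as np

# Predicted Jacobi convergence for the model Poisson problem:
# spectral radius rho = cos(pi/n) ~ 1 - O(1/n^2); iterations ~ ln(1/eps) / (-ln(rho)) = O(n^2 ln(1/eps))
eps = 1e-6
for n in (16, 32, 64, 128):
    rho = np.cos(np.pi / n)
    iters = np.log(1.0 / eps) / (-np.log(rho))
    print(f"n = {n:4d}   rho = {rho:.6f}   predicted iterations ~ {iters:9.0f}")
```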