Parallel Scientific Computing

Contents

Preface
Introduction

Chapter 1. Computer Architectures
  1.1. Different types of parallelism
    1.1.1. Overlap, concurrency and parallelism
    1.1.2. Temporal and spatial parallelism for arithmetic logic units
    1.1.3. Parallelism and memory
  1.2. Memory architecture
    1.2.1. Interleaved multi-bank memory
    1.2.2. Memory hierarchy
    1.2.3. Distributed memory
  1.3. Hybrid architecture
    1.3.1. Graphics-type accelerators
    1.3.2. Hybrid computers

Chapter 2. Parallelization and Programming Models
  2.1. Parallelization
  2.2. Performance criteria
    2.2.1. Degree of parallelism
    2.2.2. Load balancing
    2.2.3. Granularity
    2.2.4. Scalability
  2.3. Data parallelism
    2.3.1. Loop tasks
    2.3.2. Dependencies
    2.3.3. Examples of dependence
    2.3.4. Reduction operations
    2.3.5. Nested loops
    2.3.6. OpenMP
  2.4. Vectorization: a case study
    2.4.1. Vector computers and vectorization
    2.4.2. Dependence
    2.4.3. Reduction operations
    2.4.4. Pipeline operations
  2.5. Message-passing
    2.5.1. Message-passing programming
    2.5.2. Parallel environment management
    2.5.3. Point-to-point communications
    2.5.4. Collective communications
  2.6. Performance analysis

Chapter 3. Parallel Algorithm Concepts
  3.1. Parallel algorithms for recurrences
    3.1.1. The principles of reduction methods
    3.1.2. Overhead and stability of reduction methods
    3.1.3. Cyclic reduction
  3.2. Data locality and distribution: product of matrices
    3.2.1. Row and column algorithms
    3.2.2. Block algorithms
    3.2.3. Distributed algorithms
    3.2.4. Implementation

Chapter 4. Basics of Numerical Matrix Analysis
  4.1. Review of basic notions of linear algebra
    4.1.1. Vector spaces, scalar products and orthogonal projection
    4.1.2. Linear applications and matrices
  4.2. Properties of matrices
    4.2.1. Matrices, eigenvalues and eigenvectors
    4.2.2. Norms of a matrix
    4.2.3. Basis change
    4.2.4. Conditioning of a matrix

Chapter 5. Sparse Matrices
  5.1. Origins of sparse matrices
  5.2. Parallel formation of sparse matrices: shared memory
  5.3. Parallel formation by block of sparse matrices: distributed memory
    5.3.1. Parallelization by sets of vertices
    5.3.2. Parallelization by sets of elements
    5.3.3. Comparison: sets of vertices and elements

Chapter 6. Solving Linear Systems
  6.1. Direct methods
  6.2. Iterative methods

Chapter 7. LU Methods for Solving Linear Systems
  7.1. Principle of LU decomposition
  7.2. Gauss factorization
  7.3. Gauss-Jordan factorization
    7.3.1. Row pivoting
  7.4. Crout and Cholesky factorizations for symmetric matrices

Chapter 8. Parallelization of LU Methods for Dense Matrices
  8.1. Block factorization
  8.2. Implementation of block factorization in a message-passing environment
  8.3. Parallelization of forward and backward substitutions

Chapter 9. LU Methods for Sparse Matrices
  9.1. Structure of factorized matrices
  9.2. Symbolic factorization and renumbering
  9.3. Elimination trees
  9.4. Elimination trees and dependencies
  9.5. Nested dissections
  9.6. Forward and backward substitutions

Chapter 10. Basics of Krylov Subspaces
  10.1. Krylov subspaces
  10.2. Construction of the Arnoldi basis

Chapter 11. Methods with Complete Orthogonalization for Symmetric Positive Definite Matrices
  11.1. Construction of the Lanczos basis for symmetric matrices
  11.2. The Lanczos method
  11.3. The conjugate gradient method
  11.4. Comparison with the gradient method
  11.5. Principle of preconditioning for symmetric positive definite matrices

Chapter 12. Exact Orthogonalization Methods for Arbitrary Matrices
  12.1. The GMRES method
  12.2. The case of symmetric matrices: the MINRES method
  12.3. The ORTHODIR method
  12.4. Principle of preconditioning for non-symmetric matrices

Chapter 13. Biorthogonalization Methods for Non-symmetric Matrices
  13.1. Lanczos biorthogonal basis for non-symmetric matrices
  13.2. The non-symmetric Lanczos method
  13.3. The biconjugate gradient method: BiCG
  13.4. The quasi-minimal residual method: QMR
  13.5. The BiCGSTAB

Chapter 14. Parallelization of Krylov Methods
  14.1. Parallelization of dense matrix-vector product
  14.2. Parallelization of sparse matrix-vector product based on node sets
  14.3. Parallelization of sparse matrix-vector product based on element sets
    14.3.1. Review of the principles of domain decomposition
    14.3.2. Matrix-vector product
    14.3.3. Interface exchanges
    14.3.4. Asynchronous matrix-vector product with non-blocking communications
    14.3.5. Comparison: parallelization based on node and element sets
  14.4. Parallelization of the scalar product
    14.4.1. By weight
    14.4.2. By distributivity
    14.4.3. By ownership
  14.5. Summary of the parallelization of Krylov methods

Chapter 15. Parallel Preconditioning Methods
  15.1. Diagonal
  15.2. Incomplete factorization methods
    15.2.1. Principle
    15.2.2. Parallelization
  15.3. Schur complement method
    15.3.1. Optimal local preconditioning
    15.3.2. Principle of the Schur complement method
    15.3.3. Properties of the Schur complement method
  15.4. Algebraic multigrid
    15.4.1. Preconditioning using projection
    15.4.2. Algebraic construction of a coarse grid
    15.4.3. Algebraic multigrid methods
  15.5. The Schwarz additive method of preconditioning
    15.5.1. Principle of the overlap
    15.5.2. Multiplicative versus additive Schwarz methods
    15.5.3. Additive Schwarz preconditioning
    15.5.4. Restricted additive Schwarz: parallel implementation
  15.6. Preconditioners based on the physics
    15.6.1. Gauss-Seidel method
    15.6.2. Linelet method

Appendices
  Appendix 1
  Appendix 2
  Appendix 3

Bibliography
Index