Iterative Methods for Linear Systems

Iterative Methods for Linear Systems
JASS 2009
Student: Rishi Patil
Advisor: Prof. Thomas Huckle

Outline
- Basics: matrices and their properties; eigenvalues, condition number
- Iterative methods: direct and indirect methods
- Krylov subspace methods
  - Ritz-Galerkin approach: CG
  - Minimum residual approach: GMRES/MINRES
  - Petrov-Galerkin approach: BiCG, QMR, CGS

Basics
Linear system of equations: Ax = b.
A Hermitian matrix (or self-adjoint matrix) is a square matrix with complex entries that is equal to its own conjugate transpose: the element in the i-th row and j-th column equals the complex conjugate of the element in the j-th row and i-th column, for all indices i and j. Example:

    A = [ 3     2-i ]
        [ 2+i   1   ]

A real matrix is symmetric if a_ij = a_ji.
A is positive definite if x^T A x > 0 for every nonzero vector x.
Quadratic form:                  f(x) = (1/2) x^T A x - b^T x + c
Gradient of the quadratic form:  f'(x) = (1/2) A^T x + (1/2) A x - b   (= A x - b when A is symmetric)
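The following NumPy sketch (the matrix, right-hand side, and constant are made-up toy data, not taken from the slides) checks numerically that for a symmetric A the gradient of the quadratic form is A x - b, so the minimizer of f solves Ax = b:

    import numpy as np

    # Toy symmetric positive-definite system (illustrative assumption).
    A = np.array([[3.0, 2.0],
                  [2.0, 6.0]])
    b = np.array([2.0, -8.0])
    c = 0.0

    def f(x):
        return 0.5 * x @ A @ x - b @ x + c      # the quadratic form

    def grad_f(x):
        return 0.5 * (A.T + A) @ x - b          # reduces to A x - b for symmetric A

    x_star = np.linalg.solve(A, b)              # direct solution of A x = b
    print(f(x_star), grad_f(x_star))            # gradient ~ [0, 0]: the minimizer of f solves A x = b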

Various quadratic forms (figure): positive-definite matrix, negative-definite matrix, singular positive-indefinite matrix, indefinite matrix.

Eigenvalues and Eigenvectors
For an n x n matrix A, a scalar λ and a nonzero vector v that satisfy Av = λv are called an eigenvalue and an eigenvector of A.
If the matrix is symmetric, the following properties hold:
(a) the eigenvalues of A are real;
(b) eigenvectors associated with distinct eigenvalues are orthogonal.
A symmetric matrix A is positive definite (or positive semidefinite) if and only if all eigenvalues of A are positive (or nonnegative).

Eigenvalues and Eigenvectors
Why should we care about the eigenvalues? Iterative methods often depend on applying A to a vector over and over again:
(a) if |λ| < 1, then A^i v = λ^i v vanishes as i approaches infinity;
(b) if |λ| > 1, then A^i v = λ^i v grows to infinity.

Some more terms:
- Spectral radius of a matrix: ρ(A) = max_i |λ_i|
- Condition number (for symmetric positive definite A): κ(A) = λ_max / λ_min
- Error: e = x_exact - x_approx
- Residual: r = b - A x_approx
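A short NumPy illustration of these quantities (the matrix, right-hand side, and perturbation are made-up toy data):

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [1.0, 3.0]])                       # symmetric positive definite
    b = np.array([1.0, 2.0])

    eigvals = np.linalg.eigvalsh(A)                  # eigenvalues of the symmetric A
    rho = np.max(np.abs(eigvals))                    # spectral radius rho(A)
    kappa = np.max(eigvals) / np.min(eigvals)        # condition number for SPD A

    x_exact = np.linalg.solve(A, b)
    x_app = x_exact + np.array([1e-3, -1e-3])        # a slightly wrong "approximate" solution
    error = x_exact - x_app                          # e = x_exact - x_app
    residual = b - A @ x_app                         # r = b - A x_app
    print(rho, kappa, np.linalg.norm(error), np.linalg.norm(residual))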

Preconditioning
Preconditioning is a technique for improving the condition number of a matrix. Suppose that M is a symmetric, positive-definite matrix that approximates A but is easier to invert. We can solve Ax = b indirectly by solving M^-1 A x = M^-1 b.
Types of preconditioners:
- Perfect preconditioner M = A: condition number 1 and solution in one iteration, but solving Mx = b is as hard as the original problem, so it is not a useful preconditioner.
- Diagonal preconditioner: trivial to invert, but mediocre.
- Incomplete Cholesky: A ≈ L L^T; not always stable.
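A sketch of the effect of the simple diagonal (Jacobi) preconditioner M = diag(A): for the made-up, badly scaled SPD matrix below, the condition number of M^-1 A is much smaller than that of A.

    import numpy as np

    n = 50
    d = np.linspace(1.0, 1000.0, n)                            # widely varying diagonal
    A = np.diag(d) + 0.01 * (np.ones((n, n)) - np.eye(n))      # symmetric positive definite, badly scaled
    M_inv = np.diag(1.0 / np.diag(A))                          # diagonal M is trivial to invert

    print("cond(A)      =", np.linalg.cond(A))                 # large, roughly 1e3
    print("cond(M^-1 A) =", np.linalg.cond(M_inv @ A))         # close to 1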

Stationary and non-stationary methods
Stationary methods for Ax = b: x^(k+1) = R x^(k) + c, where neither R nor c depends on the iteration counter k.
Splitting of A: A = M - K with nonsingular M. Then
    Ax = Mx - Kx = b   =>   x = M^-1 K x + M^-1 b = R x + c
Examples: Jacobi method, Gauss-Seidel, Successive Overrelaxation (SOR).

Jacobi Method
Splitting for the Jacobi method: M = D and K = L + U, giving
    x^(k+1) = D^-1 ((L + U) x^(k) + b)
i.e. solve for x_i from equation i, assuming the other entries are fixed. For the 2-D model problem on a grid:
    for i = 1 to n
        for j = 1 to n
            u_{i,j}^(k+1) = (u_{i-1,j}^(k) + u_{i+1,j}^(k) + u_{i,j-1}^(k) + u_{i,j+1}^(k)) / 4
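A minimal NumPy version of the Jacobi iteration for a general matrix, following the splitting above (M = D, K = L + U). Convergence is assumed (e.g. A strictly diagonally dominant); the small test system is made up.

    import numpy as np

    def jacobi(A, b, tol=1e-10, max_iter=500):
        x = np.zeros_like(b)
        D = np.diag(A)                      # the diagonal of A (the matrix M)
        R = A - np.diag(D)                  # off-diagonal part, equal to -(L + U)
        for _ in range(max_iter):
            x_new = (b - R @ x) / D         # x^(k+1) = D^-1 ((L + U) x^(k) + b)
            if np.linalg.norm(x_new - x) < tol:
                return x_new
            x = x_new
        return x

    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 4.0, 1.0],
                  [0.0, 1.0, 4.0]])         # strictly diagonally dominant
    b = np.array([1.0, 2.0, 3.0])
    print(jacobi(A, b), np.linalg.solve(A, b))   # both give (approximately) the same solution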

Gauss-Seidel Method and SOR (Successive Over-Relaxation)
Splitting for Gauss-Seidel: M = D - L and K = U, giving
    x^(k+1) = (D - L)^-1 (U x^(k) + b)
While looping over the equations, use the most recent values x_i. For the 2-D model problem:
    for i = 1 to n
        for j = 1 to n
            u_{i,j}^(k+1) = (u_{i-1,j}^(k+1) + u_{i+1,j}^(k) + u_{i,j-1}^(k+1) + u_{i,j+1}^(k)) / 4
SOR relaxes the Gauss-Seidel update with a parameter ω:
    x_i^(k+1) = ω x_i^(GS) + (1 - ω) x_i^(k),   where x_i^(GS) is the Gauss-Seidel value,
or, in matrix form,
    x^(k+1) = (D - ωL)^-1 (ωU + (1 - ω) D) x^(k) + ω (D - ωL)^-1 b
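A minimal NumPy SOR sweep (ω = 1 gives Gauss-Seidel): each x[i] is updated in place, so the most recent values are used, as described above. The test system is made up.

    import numpy as np

    def sor(A, b, omega=1.0, tol=1e-10, max_iter=500):
        n = len(b)
        x = np.zeros(n)
        for _ in range(max_iter):
            x_old = x.copy()
            for i in range(n):
                sigma = A[i, :i] @ x[:i] + A[i, i + 1:] @ x[i + 1:]  # new x[:i], old x[i+1:]
                x_gs = (b[i] - sigma) / A[i, i]                      # Gauss-Seidel value
                x[i] = omega * x_gs + (1.0 - omega) * x_old[i]       # SOR relaxation
            if np.linalg.norm(x - x_old) < tol:
                break
        return x

    A = np.array([[4.0, 1.0, 0.0],
                  [1.0, 4.0, 1.0],
                  [0.0, 1.0, 4.0]])
    b = np.array([1.0, 2.0, 3.0])
    print(sor(A, b, omega=1.2), np.linalg.solve(A, b))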

Stationary and non-stationary methods
Non-stationary methods: the iteration coefficients change from step to step; they are computed from inner products of residuals or other vectors arising in the iterative method.
Examples: Conjugate Gradient (CG), Minimum Residual (MINRES), Generalized Minimal Residual (GMRES), BiConjugate Gradient (BiCG), Quasi-Minimal Residual (QMR), Conjugate Gradient Squared (CGS).

Descent Algorithms
Fundamental structure underlying almost all descent algorithms:
1. Start with an initial point.
2. Determine, according to a fixed rule, a direction of movement.
3. Move in that direction to a relative minimum of the objective function.
4. At the new point, determine a new direction and repeat the process.
Different algorithms differ in the rule by which successive directions of movement are selected.

The Method of Steepest Descent
In the method of steepest descent, one starts with an arbitrary point x_(0) and takes a series of steps x_(1), x_(2), ... until we are satisfied that we are close enough to the solution. In each step, one chooses the direction in which f decreases most quickly.
Definitions:
- error vector: e_(i) = x_(i) - x
- residual: r_(i) = b - A x_(i)
From Ax = b it follows that r_(i) = -A e_(i) = -f'(x_(i)): the residual is the direction of steepest descent.

The Method of Steepest Descent
Starting at (-2, -2), take steps in the direction of steepest descent of f. The parabola shown is the intersection of the surface f(x) with the vertical plane containing the search line; find the point on this intersection that minimizes f. The gradient at the bottommost point is orthogonal to the gradient of the previous step.

The Method of Steepest Descent (figure only).

The Method of Steepest Descent: the algorithm
    r_(i)   = b - A x_(i)
    α_(i)   = (r_(i)^T r_(i)) / (r_(i)^T A r_(i))
    x_(i+1) = x_(i) + α_(i) r_(i)
As written, two matrix-vector multiplications are required per step. To avoid one of them, one uses the recurrence
    r_(i+1) = r_(i) - α_(i) A r_(i)      (equivalently, e_(i+1) = e_(i) + α_(i) r_(i))
The disadvantage of using this recurrence is that the residual sequence is determined without any feedback from the value of x_(i), so that round-off errors may cause x_(i) to converge to some point near x.
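The same algorithm in NumPy, using the residual recurrence so that only one matrix-vector product (A r) is needed per step. The small SPD test system is an illustrative assumption.

    import numpy as np

    def steepest_descent(A, b, tol=1e-10, max_iter=10000):
        x = np.zeros_like(b)
        r = b - A @ x
        for _ in range(max_iter):
            Ar = A @ r
            alpha = (r @ r) / (r @ Ar)
            x = x + alpha * r
            r = r - alpha * Ar              # recurrence instead of recomputing b - A x
            if np.linalg.norm(r) < tol:
                break
        return x

    A = np.array([[3.0, 2.0],
                  [2.0, 6.0]])
    b = np.array([2.0, -8.0])
    print(steepest_descent(A, b))           # approximately [2, -2]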

Steepest Descent: the problem
The gradient at the minimum of a line search is orthogonal to the direction of that search, so the steepest descent algorithm tends to make right-angle turns, taking many steps down a long, narrow potential well: too many steps to reach even a simple minimum.

The Method of Conjugate Directions
Basic idea:
1. Pick a set of orthogonal search directions d_(0), d_(1), ..., d_(n-1).
2. Take exactly one step in each search direction, chosen to line up with x.
3. The solution is reached in n steps.
Mathematical formulation:
1. In each step we choose a point x_(i+1) = x_(i) + α_(i) d_(i).
2. To find α_(i), we use the fact that e_(i+1) is orthogonal to d_(i).

The Method of Conjugate Directions
To solve the problem of not knowing e, one makes the search directions A-orthogonal rather than orthogonal to each other, i.e.
    d_(i)^T A d_(j) = 0   for i ≠ j.

The Method of Conjugate Directions
The new requirement is that e_(i+1) be A-orthogonal to d_(i). This condition is equivalent to finding the minimum point along the search direction d_(i):
    d/dα f(x_(i+1)) = f'(x_(i+1))^T (d/dα) x_(i+1) = -r_(i+1)^T d_(i) = 0
    d_(i)^T A e_(i+1) = 0
    d_(i)^T A (e_(i) + α_(i) d_(i)) = 0
    α_(i) = -(d_(i)^T A e_(i)) / (d_(i)^T A d_(i)) = (d_(i)^T r_(i)) / (d_(i)^T A d_(i))
If the search vectors were the residuals, this formula would be identical to the method of steepest descent.

The Method of Conjugate Directions
Calculation of the A-orthogonal search directions by a conjugate Gram-Schmidt process:
1. Take a set of n linearly independent vectors u_0, u_1, ..., u_{n-1}.
2. Set d_(0) = u_0.
3. For i > 0, take u_i and subtract all components that are not A-orthogonal to the previous search directions:
    d_(i) = u_i + Σ_{j=0}^{i-1} β_{ij} d_(j),   with   β_{ij} = -(u_i^T A d_(j)) / (d_(j)^T A d_(j))

The Method of Conjugate Directions
The method of conjugate gradients is simply the method of conjugate directions where the search directions are constructed by conjugation of the residuals, i.e. u_i = r_(i). This simplifies the calculation of the new search direction, because
    β_{ij} = (1/α_(i-1)) · (r_(i)^T r_(i)) / (d_(i-1)^T A d_(i-1)) = (r_(i)^T r_(i)) / (r_(i-1)^T r_(i-1))   if i = j + 1,
    β_{ij} = 0                                                                                               if i > j + 1.
The new search direction is a linear combination of the previous search direction and the new residual:
    d_(i+1) = r_(i+1) + β_(i+1) d_(i)

The Method of Conjugate Directions
    x_0 = 0,  r_0 = b,  d_0 = r_0
    for k = 1, 2, 3, ...
        α_k = (r_{k-1}^T r_{k-1}) / (d_{k-1}^T A d_{k-1})    % step length
        x_k = x_{k-1} + α_k d_{k-1}                          % approximate solution
        r_k = r_{k-1} - α_k A d_{k-1}                        % residual
        β_k = (r_k^T r_k) / (r_{k-1}^T r_{k-1})              % improvement
        d_k = r_k + β_k d_{k-1}                              % search direction
One matrix-vector multiplication per iteration, two vector dot products per iteration, four n-vectors of working storage.
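The CG pseudocode above transcribed into NumPy. It assumes A is symmetric positive definite; the small test system is made up.

    import numpy as np

    def conjugate_gradient(A, b, tol=1e-10):
        n = len(b)
        x = np.zeros(n)
        r = b.copy()                        # r_0 = b - A x_0 with x_0 = 0
        d = r.copy()                        # d_0 = r_0
        rs_old = r @ r
        for _ in range(n):                  # at most n steps in exact arithmetic
            Ad = A @ d                      # the one matrix-vector product per iteration
            alpha = rs_old / (d @ Ad)       # step length
            x = x + alpha * d               # approximate solution
            r = r - alpha * Ad              # residual
            rs_new = r @ r
            if np.sqrt(rs_new) < tol:
                break
            beta = rs_new / rs_old          # improvement
            d = r + beta * d                # next search direction
            rs_old = rs_new
        return x

    A = np.array([[3.0, 2.0],
                  [2.0, 6.0]])
    b = np.array([2.0, -8.0])
    print(conjugate_gradient(A, b))         # approximately [2, -2]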

Krylov subspace
The Krylov subspace K_j is the space of all linear combinations of b, Ab, ..., A^{j-1} b. The Krylov matrix is K_j = [ b  Ab  A^2 b  ...  A^{j-1} b ].
Methods to construct a basis for K_j: Arnoldi's method and the Lanczos method.
Approaches to choosing a good x_j in K_j:
- Ritz-Galerkin approach: r_j = b - A x_j is orthogonal to K_j (Conjugate Gradient)
- Minimum residual approach: r_j has minimum norm for x_j in K_j (GMRES and MINRES)
- Petrov-Galerkin approach: r_j is orthogonal to a different space K_j(A^T) (BiConjugate Gradient)

Arnoldi's Method
The best basis q_1, ..., q_j for the Krylov subspace K_j is orthonormal. Each new q_{j+1} comes from orthogonalizing t = A q_j against the basis vectors q_1, ..., q_j that are already chosen. The iteration that computes these orthonormal q's is Arnoldi's method.
    q_1 = b / ||b||                    % normalize b so that ||q_1|| = 1
    for j = 1, ..., n-1                % start computation of q_{j+1}
        t = A q_j                      % one matrix multiplication
        for i = 1, ..., j              % t is in the space K_{j+1}
            h_{ij} = q_i^T t           % h_{ij} q_i = projection of t on q_i
            t = t - h_{ij} q_i         % subtract that projection
        end                            % t is orthogonal to q_1, ..., q_j
        h_{j+1,j} = ||t||              % compute the length of t
        q_{j+1} = t / h_{j+1,j}        % normalize t so that ||q_{j+1}|| = 1
    end                                % q_1, ..., q_n are orthonormal
In matrix form: A Q_{n-1} = Q_n H_{n,n-1}, where H_{n,n-1} is an upper Hessenberg matrix.
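The Arnoldi iteration above in NumPy: it builds an orthonormal basis Q of the Krylov subspace and the upper Hessenberg matrix H with A Q_m = Q_{m+1} H. The random test matrix is an illustrative assumption.

    import numpy as np

    def arnoldi(A, b, m):
        n = len(b)
        Q = np.zeros((n, m + 1))
        H = np.zeros((m + 1, m))
        Q[:, 0] = b / np.linalg.norm(b)            # normalize b to q_1
        for j in range(m):
            t = A @ Q[:, j]                        # one matrix-vector multiplication
            for i in range(j + 1):
                H[i, j] = Q[:, i] @ t              # projection of t on q_i
                t = t - H[i, j] * Q[:, i]          # subtract that projection
            H[j + 1, j] = np.linalg.norm(t)        # length of what is left
            if H[j + 1, j] < 1e-12:                # (lucky) breakdown: the subspace is invariant
                return Q[:, :j + 1], H[:j + 1, :j]
            Q[:, j + 1] = t / H[j + 1, j]          # normalize t to q_{j+1}
        return Q, H

    rng = np.random.default_rng(0)
    A = rng.random((6, 6))
    b = rng.random(6)
    Q, H = arnoldi(A, b, 4)
    print(np.allclose(A @ Q[:, :4], Q @ H))        # checks A Q_4 = Q_5 H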

Lanczos Method
The Lanczos method is the Arnoldi iteration specialized to a symmetric (real) matrix A. Then
    H_{n-1,n-1} = Q_{n-1}^T A Q_{n-1}
is tridiagonal, which means that in the orthogonalization process each new vector has to be orthogonalized only against the previous two vectors, since the other inner products vanish.
    β_0 = 0,  q_0 = 0,  b arbitrary,  q_1 = b / ||b||
    for i = 1, ..., n-1
        v = A q_i
        α_i = q_i^T v
        v = v - β_{i-1} q_{i-1} - α_i q_i
        β_i = ||v||
        q_{i+1} = v / β_i
    end
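The Lanczos recurrence above in NumPy for a symmetric A; each new vector is orthogonalized only against the previous two. The random symmetric test matrix is an illustrative assumption.

    import numpy as np

    def lanczos(A, b, m):
        n = len(b)
        Q = np.zeros((n, m + 1))
        alpha = np.zeros(m)
        beta = np.zeros(m)
        Q[:, 0] = b / np.linalg.norm(b)
        for i in range(m):
            v = A @ Q[:, i]
            alpha[i] = Q[:, i] @ v
            v = v - alpha[i] * Q[:, i]
            if i > 0:
                v = v - beta[i - 1] * Q[:, i - 1]
            beta[i] = np.linalg.norm(v)
            if beta[i] < 1e-12:                      # breakdown: invariant subspace found
                break
            Q[:, i + 1] = v / beta[i]
        return Q, alpha, beta

    rng = np.random.default_rng(1)
    M = rng.random((6, 6))
    A = M + M.T                                      # symmetric test matrix
    b = rng.random(6)
    Q, a, bt = lanczos(A, b, 5)
    T = Q[:, :5].T @ A @ Q[:, :5]                    # should be (numerically) tridiagonal
    print(np.round(T, 8))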

Minimum Residual Methods
Problem: if A is not symmetric positive definite, CG is not guaranteed to solve Ax = b.
Solution: minimum residual methods. Choose x_j in the Krylov subspace K_j so that ||b - A x_j|| is minimal.
The first j orthonormal vectors q_1, ..., q_j go into the columns of Q_j, so Q_j^T Q_j = I. Setting x_j = Q_j y and using the first j columns of Arnoldi's formula A Q = Q H:
    r_j = b - A x_j = b - A Q_j y = b - Q_{j+1} H_{j+1,j} y

Minimum Residual Methods
The problem becomes: choose y to minimize
    ||r_j|| = || Q_{j+1}^T b - H_{j+1,j} y ||
This is a least-squares problem; the zeros in H and in Q_{j+1}^T b are exploited to obtain a fast algorithm for computing y.
GMRES (Generalized Minimal Residual): A is not symmetric, so the upper triangular part of H can be full, and all previously computed vectors have to be stored.
MINRES (Minimal Residual): A is symmetric (possibly indefinite), so H is tridiagonal; this avoids storage of all basis vectors for the Krylov subspace.
Aim: to clear out the nonzero diagonal below the main diagonal of H. This is done by Givens rotations.

GMRES
Algorithm (GMRES):
    q_1 = b / ||b||
    for j = 1, 2, 3, ...
        perform step j of the Arnoldi iteration
        find y that minimizes ||r_j|| = ||Q_{j+1}^T b - H_{j+1,j} y||
        x_j = Q_j y
Full GMRES: the upper triangle of H can be full, so step j becomes expensive, and possibly inaccurate, as j increases.
GMRES(m): restarts the GMRES algorithm every m steps; however, it is tricky to choose m.
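A compact, self-contained sketch of full GMRES following the outline above. For simplicity it solves the small least-squares problem with a dense solver instead of the Givens rotations mentioned earlier, and it uses a made-up dense test matrix; it is an illustration, not a production implementation.

    import numpy as np

    def gmres_full(A, b, tol=1e-10):
        n = len(b)
        beta = np.linalg.norm(b)
        Q = np.zeros((n, n + 1))
        H = np.zeros((n + 1, n))
        Q[:, 0] = b / beta
        x = np.zeros(n)
        for j in range(n):
            t = A @ Q[:, j]                              # step j of the Arnoldi iteration
            for i in range(j + 1):
                H[i, j] = Q[:, i] @ t
                t = t - H[i, j] * Q[:, i]
            H[j + 1, j] = np.linalg.norm(t)
            if H[j + 1, j] > 1e-14:
                Q[:, j + 1] = t / H[j + 1, j]
            e1 = np.zeros(j + 2)
            e1[0] = beta                                 # Q_{j+1}^T b = ||b|| e_1
            y, *_ = np.linalg.lstsq(H[:j + 2, :j + 1], e1, rcond=None)
            x = Q[:, :j + 1] @ y                         # x_j = Q_j y
            if np.linalg.norm(b - A @ x) < tol:          # check the true residual
                break
        return x

    rng = np.random.default_rng(2)
    A = rng.random((8, 8)) + 8 * np.eye(8)               # nonsymmetric, well conditioned
    b = rng.random(8)
    x = gmres_full(A, b)
    print(np.linalg.norm(b - A @ x))                     # small residual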

Petrov-Galerkin approach
r_j is orthogonal to a different space K_j(A^T):
- BiCG (Bi-Conjugate Gradient)
- QMR (Quasi-Minimal Residual)
- CGS (Conjugate Gradient Squared)

Lanczos Bi-Orthogonalization Procedure
An extension of the symmetric Lanczos algorithm: it builds a pair of bi-orthogonal bases for the two subspaces K_m(A, v_1) and K_m(A^T, w_1).

Bi-Conjugate Gradient (BiCG)
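The BiCG algorithm itself appears on this slide only as a figure that is not transcribed here. For orientation, the following is a minimal NumPy sketch of the standard unpreconditioned BiCG iteration (shadow residual chosen as r̃_0 = r_0, made-up test matrix); it is an assumption-laden illustration, not the slide's exact formulation.

    import numpy as np

    def bicg(A, b, tol=1e-10, max_iter=1000):
        n = len(b)
        x = np.zeros(n)
        r = b.copy()                        # residual of A x = b  (x_0 = 0)
        rt = r.copy()                       # "shadow" residual for the A^T system
        p, pt = r.copy(), rt.copy()
        rho_old = rt @ r
        for _ in range(max_iter):
            Ap = A @ p                      # one product with A ...
            Atpt = A.T @ pt                 # ... and one with A^T per iteration
            alpha = rho_old / (pt @ Ap)
            x = x + alpha * p
            r = r - alpha * Ap
            rt = rt - alpha * Atpt
            if np.linalg.norm(r) < tol:
                break
            rho_new = rt @ r
            beta = rho_new / rho_old        # may break down if rho_old is (near) zero
            p = r + beta * p
            pt = rt + beta * pt
            rho_old = rho_new
        return x

    rng = np.random.default_rng(3)
    A = rng.random((6, 6)) + 6 * np.eye(6)  # nonsymmetric test matrix
    b = rng.random(6)
    x = bicg(A, b)
    print(np.linalg.norm(b - A @ x))        # small residual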

Quasi-Minimal Residual (QMR)
QMR uses the unsymmetric Lanczos algorithm to generate a basis for the Krylov subspaces. The look-ahead technique avoids breakdowns during the Lanczos process and makes QMR robust.

Conjugate Gradient Squared (CGS)

Summary
Stationary iterative solvers: Jacobi, Gauss-Seidel, SOR.
Non-stationary solvers (Krylov subspace methods):
- Conjugate Gradient: symmetric positive definite systems
- GMRES: non-symmetric matrices, but expensive (all basis vectors must be stored)
- MINRES: symmetric (possibly indefinite) matrices
- BiCG: non-symmetric, two matrix-vector products per iteration
- QMR: non-symmetric, avoids the irregular convergence of BiCG
- CGS: non-symmetric, often faster than BiCG, does not require the transpose

References
- Jonathan Richard Shewchuk, An Introduction to the Conjugate Gradient Method Without the Agonizing Pain, August 4, 1994.
- Gene H. Golub and Henk A. van der Vorst, Closer to the Solution: Iterative Linear Solvers.
- Gilbert Strang, Krylov Subspaces and Conjugate Gradients.

Thank You!