Lecture 11. Fast Linear Solvers: Iterative Methods. J. Chaudhry. Department of Mathematics and Statistics, University of New Mexico

Summary: Complexity of Linear Solves Ax = b

- diagonal system: O(n)
- upper or lower triangular system: O(n^2)
- full system with GE or Cholesky: O(n^3); partial pivoting adds O(n^2)
- full system with LU: O(n^3); LU back solve: O(n^2)
- m different right-hand sides: O(mn^3) or O(n^3 + mn^2)
- tridiagonal system: O(n)
- m-band system: O(m^2 n)

Approximate solutions

So far we have been seeking exact solutions to Ax = b. What if we only need an approximation $\hat{x}$ to x? We would like some $\hat{x}$ so that $\|\hat{x} - x\| \le \epsilon$, where $\epsilon$ is some tolerance.

The Residual

We can't actually evaluate the error $e = x - \hat{x}$. But...

For the true solution x: $b - Ax = 0$. For an approximate solution $\hat{x}$: $b - A\hat{x} \ne 0$.

We recall that $\hat{r} = b - A\hat{x}$ is the residual. It is a way to measure the error. In fact,

$\hat{r} = b - A\hat{x} = Ax - A\hat{x} = A\hat{e}$
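
A quick NumPy check of this identity (the small system and the perturbation below are made up purely for illustration):

    import numpy as np

    # A small made-up system, just to illustrate the identity r = A e
    A = np.array([[4.0, 1.0],
                  [1.0, 3.0]])
    b = np.array([1.0, 2.0])

    x_true = np.linalg.solve(A, b)                   # "exact" solution (up to rounding)
    x_hat = x_true + 1e-3 * np.array([1.0, -1.0])    # a perturbed approximation

    r_hat = b - A @ x_hat       # residual
    e_hat = x_true - x_hat      # error
    print(np.allclose(r_hat, A @ e_hat))             # True (up to rounding)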

How big is the residual?

For a given approximation $\hat{x}$ to x, how big is the residual $\hat{r} = b - A\hat{x}$? A norm $\|r\|$ gives a magnitude:

$\|r\|_1 = \sum_{i=1}^{n} |r_i|$

$\|r\|_2 = \left( \sum_{i=1}^{n} r_i^2 \right)^{1/2}$

$\|r\|_\infty = \max_{1 \le i \le n} |r_i|$
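
In NumPy all three norms are available through numpy.linalg.norm; a small illustration with a made-up residual vector:

    import numpy as np

    r = np.array([3.0, -4.0, 1.0])      # an example residual

    print(np.linalg.norm(r, 1))         # 1-norm: |3| + |-4| + |1| = 8
    print(np.linalg.norm(r, 2))         # 2-norm: sqrt(9 + 16 + 1) = sqrt(26)
    print(np.linalg.norm(r, np.inf))    # infinity-norm: max absolute entry = 4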

Approximating x...

Suppose we made a wild guess $x^{(0)} \approx x$ at the solution x of Ax = b. How do we improve $x^{(0)}$?

Ideally: $x^{(1)} = x^{(0)} + e^{(0)}$, but to obtain $e^{(0)}$ we must know x. Not a viable method.

Ideally (another way):

$x^{(1)} = x^{(0)} + e^{(0)} = x^{(0)} + (x - x^{(0)}) = x^{(0)} + (A^{-1} b - x^{(0)}) = x^{(0)} + A^{-1}(b - A x^{(0)}) = x^{(0)} + A^{-1} r^{(0)}$

An iteration

Again, the update

$x^{(1)} = x^{(0)} + A^{-1} r^{(0)}$

is nonsense as a method, since $A^{-1}$ is needed. What if we approximate $A^{-1}$? Suppose $Q^{-1} \approx A^{-1}$ and is cheap to construct; then

$x^{(1)} = x^{(0)} + Q^{-1} r^{(0)}$

is a good step. Continuing,

$x^{(k)} = x^{(k-1)} + Q^{-1} r^{(k-1)}$
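
A minimal NumPy sketch of this generic iteration (the function name and stopping test are my own choices, not from the slides); the Jacobi and Gauss-Seidel methods below are this routine with particular choices of Q:

    import numpy as np

    def stationary_iteration(A, b, Q, x0, tol=1e-10, kmax=100):
        """Generic iteration x^(k) = x^(k-1) + Q^{-1} r^(k-1).

        Q approximates A and should be cheap to solve with; here we call
        np.linalg.solve(Q, r) rather than forming Q^{-1} explicitly.
        """
        x = x0.copy()
        for k in range(kmax):
            r = b - A @ x
            if np.linalg.norm(r) <= tol:
                break
            x = x + np.linalg.solve(Q, r)
        return x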

What kind of Q^{-1} do I need?

Rewrite:

$x^{(k)} = x^{(k-1)} + Q^{-1}(b - A x^{(k-1)})$

This becomes

$Q x^{(k)} = Q x^{(k-1)} + (b - A x^{(k-1)}) = (Q - A) x^{(k-1)} + b$

This is the form in the text (page 408 in NMC7 or page 322 in NMC6). Or

$x^{(k)} = Q^{-1}(Q - A) x^{(k-1)} + Q^{-1} b$

Two Popular Choices

Example: Jacobi iteration approximates A with Q = diag(A).

    x = x^(0)
    Q = D
    for k = 1 to kmax
        r = b - A x
        if ||r|| = ||b - A x|| <= tol, stop
        x = x + Q^{-1} r
    end

(Do not implement the algorithm this way! More on implementation later.)

Two Popular Choices

Let A = D - C_U - C_L, where C_U and C_L are strictly upper and strictly lower triangular, respectively.

Example: Gauss-Seidel iteration approximates A with Q = lowertri(A).

    x = x^(0)
    Q = D - C_L
    for k = 1 to kmax
        r = b - A x
        if ||r|| = ||b - A x|| <= tol, stop
        x = x + Q^{-1} r
    end

(Do not implement the algorithm this way! More on implementation later.)

Example: Let's do Jacobi and Gauss-Seidel.

Consider

$A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 8 \\ 5 \end{pmatrix}, \qquad x^{(0)} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$
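
A short check of this example in NumPy, reusing the stationary_iteration sketch above with the two choices of Q (entries are taken exactly as transcribed):

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
    b = np.array([1.0, 8.0, 5.0])
    x0 = np.zeros(3)

    Q_jacobi = np.diag(np.diag(A))   # Jacobi: Q = diag(A) = D
    Q_gs     = np.tril(A)            # Gauss-Seidel: Q = lowertri(A) = D - C_L

    print(stationary_iteration(A, b, Q_jacobi, x0))
    print(stationary_iteration(A, b, Q_gs, x0))
    print(np.linalg.solve(A, b))     # both iterations should approach this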

Why D and D - C_L?

Look again at the iteration

$x^{(k)} = x^{(k-1)} + Q^{-1} r^{(k-1)}$

Looking at the error:

$x - x^{(k)} = x - x^{(k-1)} - Q^{-1} r^{(k-1)}$

gives

$e^{(k)} = e^{(k-1)} - Q^{-1} A e^{(k-1)}$

or

$e^{(k)} = (I - Q^{-1} A) e^{(k-1)}$

or

$e^{(k)} = (I - Q^{-1} A)^k e^{(0)}$

Why D and D - C_L?

We want $e^{(k)} = (I - Q^{-1} A)^k e^{(0)}$ to converge to zero. When does $a_k = c^k$ converge? ...when $|c| < 1$.

Likewise, our iteration converges, since

$\|e^{(k)}\| = \|(I - Q^{-1} A)^k e^{(0)}\| \le \|I - Q^{-1} A\|^k \, \|e^{(0)}\|$,

when $\|I - Q^{-1} A\| < 1$.
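
For the 3 x 3 example above, this condition can be checked numerically (a sketch; the infinity norm is one convenient choice of matrix norm):

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
    I = np.eye(3)

    G_jacobi = I - np.linalg.solve(np.diag(np.diag(A)), A)   # I - D^{-1} A
    G_gs     = I - np.linalg.solve(np.tril(A), A)            # I - (D - C_L)^{-1} A

    # Both norms are below 1, so both iterations contract the error.
    print(np.linalg.norm(G_jacobi, np.inf))   # 2/3
    print(np.linalg.norm(G_gs, np.inf))       # 1/2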

Matrix Norms (Recall from an earlier lecture)

What is $\|I - Q^{-1} A\|$?

$\|A\|_1 = \max_{1 \le j \le n} \sum_{i=1}^{n} |a_{ij}|$

$\|A\|_2 = \sqrt{\rho(A^T A)}$, where $\rho(A) = \max_{1 \le i \le n} |\lambda_i|$ is the spectral radius

$\|A\|_2 = \rho(A)$ for symmetric A

$\|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{n} |a_{ij}|$
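
These formulas are easy to verify numerically; a small sketch using the symmetric example matrix from before:

    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])      # symmetric

    print(np.abs(A).sum(axis=0).max(), np.linalg.norm(A, 1))        # largest column sum
    print(np.abs(A).sum(axis=1).max(), np.linalg.norm(A, np.inf))   # largest row sum
    rho = np.abs(np.linalg.eigvals(A)).max()                        # spectral radius
    print(rho, np.linalg.norm(A, 2))     # equal here because A is symmetric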

Again, why do Jacobi and Gauss-Seidel work?

Jacobi, Gauss-Seidel (sufficient) Convergence Theorem: If A is diagonally dominant, then the Jacobi and Gauss-Seidel methods converge for any initial guess $x^{(0)}$.

Definition (Diagonal Dominance): A matrix is diagonally dominant if

$|a_{ii}| > \sum_{j=1, j \ne i}^{n} |a_{ij}|$ for all i.
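
This is a one-liner to test in NumPy (a sketch; the helper name is my own):

    import numpy as np

    def is_diagonally_dominant(A):
        """Check |a_ii| > sum over j != i of |a_ij| for every row i."""
        diag = np.abs(np.diag(A))
        off_diag = np.abs(A).sum(axis=1) - diag
        return bool(np.all(diag > off_diag))

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])
    print(is_diagonally_dominant(A))   # True, so Jacobi and Gauss-Seidel converge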

Implementation of Jacobi Algorithm (or the Smart Jacobi Algorithm)

The algorithm above uses the matrix representation

$x^{(k)} = D^{-1}(C_L + C_U) x^{(k-1)} + D^{-1} b$

Or componentwise,

$x_i^{(k)} = -\sum_{j=1, j \ne i}^{n} \left( \frac{a_{ij}}{a_{ii}} \right) x_j^{(k-1)} + \frac{b_i}{a_{ii}}$

So each sweep (from k - 1 to k) uses O(n) operations per vector element, or a total of O(n^2) operations for n vector elements. If, for each row i, $a_{ij} = 0$ for all but m values of j, each sweep uses O(mn) operations.
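
A direct NumPy rendering of one componentwise Jacobi sweep (a sketch; the function name is my own, and a real implementation would exploit sparsity in the inner sum):

    import numpy as np

    def jacobi_sweep(A, b, x_old):
        """One componentwise Jacobi sweep: every update reads only x_old."""
        n = len(b)
        x_new = np.empty(n)
        for i in range(n):
            s = A[i, :] @ x_old - A[i, i] * x_old[i]   # sum_{j != i} a_ij * x_old_j
            x_new[i] = (b[i] - s) / A[i, i]
        return x_new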

Implementation of Gauss-Seidel Algorithm (or the Smart Gauss-Seidel Algorithm)

The algorithm above uses the matrix representation

$x^{(k)} = (D - C_L)^{-1} C_U x^{(k-1)} + (D - C_L)^{-1} b$

Componentwise:

$x_i^{(k)} = -\sum_{j=1, j<i}^{n} \left( \frac{a_{ij}}{a_{ii}} \right) x_j^{(k)} - \sum_{j=1, j>i}^{n} \left( \frac{a_{ij}}{a_{ii}} \right) x_j^{(k-1)} + \frac{b_i}{a_{ii}}$

So again each sweep (from k - 1 to k) uses O(n) operations per vector element, or a total of O(n^2) operations for n vector elements. If, for each row i, $a_{ij} = 0$ for all but m values of j, each sweep uses O(mn) operations.

The difference is that in the Jacobi method, updates are saved (and not used) in a new vector. With Gauss-Seidel, an update to an element $x_i^{(k)}$ is used immediately.
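
And the corresponding componentwise Gauss-Seidel sweep (a sketch; overwriting x in place is exactly what makes the newest values available immediately):

    import numpy as np

    def gauss_seidel_sweep(A, b, x):
        """One componentwise Gauss-Seidel sweep: updates are used immediately."""
        n = len(b)
        x = x.copy()
        for i in range(n):
            # entries j < i of x are already updated, entries j > i are old
            s = A[i, :] @ x - A[i, i] * x[i]
            x[i] = (b[i] - s) / A[i, i]
        return x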

Good Exercise: Redo the earlier example with Smart Jacobi and Smart Gauss-Seidel.

Example: Consider

$A = \begin{pmatrix} 2 & 1 & 0 \\ 1 & 3 & 1 \\ 0 & 1 & 2 \end{pmatrix}, \qquad b = \begin{pmatrix} 1 \\ 8 \\ 5 \end{pmatrix}, \qquad x^{(0)} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}$

Conjugate Gradients

Suppose that A is $n \times n$, symmetric, and positive definite. Since A is positive definite, $x^T A x > 0$ for all nonzero $x \in \mathbb{R}^n$. Define a quadratic function

$\phi(x) = \frac{1}{2} x^T A x - x^T b$

It turns out that $\nabla \phi = -(b - Ax) = -r$, so $\phi(x)$ has a minimum at the x such that Ax = b.

Optimization methods look in a search direction and pick the best step:

$x^{(k+1)} = x^{(k)} + \alpha s^{(k)}$

Choose $\alpha$ so that $\phi(x^{(k)} + \alpha s^{(k)})$ is minimized in the direction of $s^{(k)}$:

$0 = \frac{d}{d\alpha} \phi(x^{(k+1)}) = \nabla\phi(x^{(k+1)})^T \frac{d}{d\alpha} x^{(k+1)} = (-r^{(k+1)})^T \frac{d}{d\alpha}\left( x^{(k)} + \alpha s^{(k)} \right) = (-r^{(k+1)})^T s^{(k)}$

Conjugate Gradients

Find $\alpha$ so that $\phi$ is minimized. We know $0 = (r^{(k+1)})^T s^{(k)}$, and also

$r^{(k+1)} = b - A x^{(k+1)} = b - A(x^{(k)} + \alpha s^{(k)}) = r^{(k)} - \alpha A s^{(k)}$

So the optimal search parameter is

$\alpha = \frac{(r^{(k)})^T s^{(k)}}{(s^{(k)})^T A s^{(k)}}$

This is CG: take a step in a search direction.
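
A tiny sketch of one such optimally sized step, using the steepest-descent choice s = r for illustration (the conjugate choice of directions comes next); the system here is made up:

    import numpy as np

    A = np.array([[4.0, 1.0],
                  [1.0, 3.0]])       # symmetric positive definite
    b = np.array([1.0, 2.0])
    x = np.zeros(2)

    r = b - A @ x
    s = r                            # search direction (steepest descent here)
    alpha = (r @ s) / (s @ (A @ s))  # optimal step length along s
    x = x + alpha * s
    r_new = r - alpha * (A @ s)      # updated residual
    print(r_new @ s)                 # ~0: new residual is orthogonal to s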

Conjugate Gradients

Neat trick: we can compute $r^{(k+1)}$ without explicitly forming $b - A x^{(k+1)}$:

$r^{(k+1)} = b - A x^{(k+1)} = b - A(x^{(k)} + \alpha s^{(k)}) = b - A x^{(k)} - \alpha A s^{(k)} = r^{(k)} - \alpha A s^{(k)}$

Conjugate Gradients

How should we pick $s^{(k)}$? Note that $\nabla\phi = -(b - Ax) = -r$, so $r$ is the negative gradient of $\phi$ (for any x), and this is a good direction. Thus, pick $s^{(0)} = r^{(0)} = b - A x^{(0)}$.

What is $s^{(1)}$? This should be in the direction of $r^{(1)}$, but conjugate to $s^{(0)}$: $(s^{(1)})^T A s^{(0)} = 0$. (Two vectors u and v are A-conjugate if $u^T A v = 0$.)

So, if we let $s^{(1)} = r^{(1)} + \beta s^{(0)}$, we can require

$0 = (s^{(1)})^T A s^{(0)} = \left( (r^{(1)})^T + \beta (s^{(0)})^T \right) A s^{(0)} = (r^{(1)})^T A s^{(0)} + \beta (s^{(0)})^T A s^{(0)}$

or

$\beta = -(r^{(1)})^T A s^{(0)} / (s^{(0)})^T A s^{(0)}$.

The same holds for $s^{(k+1)}$ in terms of $r^{(k+1)} + \beta^{(k+1)} s^{(k)}$. Further simplification (which is not simple to carry out) yields a simple method that requires only one matrix-vector product per step:

Conjugate Gradients

    x^(0) = initial guess
    r^(0) = b - A x^(0)
    s^(0) = r^(0)
    for k = 0, 1, 2, ...
        alpha^(k) = (r^(k))^T r^(k) / ((s^(k))^T A s^(k))
        x^(k+1) = x^(k) + alpha^(k) s^(k)
        r^(k+1) = r^(k) - alpha^(k) A s^(k)
        beta^(k+1) = (r^(k+1))^T r^(k+1) / ((r^(k))^T r^(k))
        s^(k+1) = r^(k+1) + beta^(k+1) s^(k)
    end
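
A direct NumPy transcription of this pseudocode (the function name, tolerance, and iteration cap are my own additions; note the single A @ s per iteration):

    import numpy as np

    def conjugate_gradient(A, b, x0, tol=1e-10, kmax=1000):
        """Conjugate gradient for symmetric positive definite A,
        following the pseudocode above."""
        x = x0.copy()
        r = b - A @ x
        s = r.copy()
        rr = r @ r
        for k in range(kmax):
            if np.sqrt(rr) <= tol:
                break
            As = A @ s                      # the one matrix-vector product per step
            alpha = rr / (s @ As)
            x = x + alpha * s
            r = r - alpha * As
            rr_new = r @ r
            beta = rr_new / rr
            s = r + beta * s
            rr = rr_new
        return x

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])         # the SPD example from earlier
    b = np.array([1.0, 8.0, 5.0])
    print(conjugate_gradient(A, b, np.zeros(3)))   # approx [-0.75, 2.5, 1.25]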