7.2 Steepest Descent and Preconditioning
- Lillian Theresa Mason
- 6 years ago
Descent methods are a broad class of iterative methods for finding solutions of the linear system $Ax = b$ for a symmetric positive definite matrix $A \in \mathbb{R}^{n \times n}$. Consider the functional $J : \mathbb{R}^n \to \mathbb{R}$ given by

$$J(y) = \tfrac{1}{2} y^T A y - y^T b$$

Theorem: Let $A \in \mathbb{R}^{n \times n}$ be symmetric positive definite and let $J$ be defined as above. Then there is exactly one $x \in \mathbb{R}^n$ for which

$$J(x) = \min_{y} J(y)$$

and that $x$ is the solution to the linear system $Ax = b$.

Proof I: Since $x$ is the solution to $Ax = b$ we can write the functional as

$$J(y) = \tfrac{1}{2} y^T A y - y^T A x$$

Then, by completing the square (and using $y^T A x = x^T A y$, which holds because $A$ is symmetric) we have

$$
\begin{aligned}
J(y) &= \tfrac{1}{2} y^T A y - y^T A x \\
     &= \tfrac{1}{2} y^T A y - y^T A x + \tfrac{1}{2} x^T A x - \tfrac{1}{2} x^T A x \\
     &= \tfrac{1}{2} y^T A y - \tfrac{1}{2} x^T A y - \tfrac{1}{2} y^T A x + \tfrac{1}{2} x^T A x - \tfrac{1}{2} x^T A x \\
     &= \tfrac{1}{2} (y - x)^T A y - \tfrac{1}{2} (y - x)^T A x - \tfrac{1}{2} x^T A x \\
     &= \tfrac{1}{2} (y - x)^T A (y - x) - \tfrac{1}{2} x^T A x
\end{aligned}
$$

We're minimizing over $y$, so $x$ is fixed. Since $A$ is positive definite, the first term in the rewritten functional is nonnegative and vanishes only when $y - x = 0$. This means that the best we can do to minimize $J$ is to choose $y$ so that the first term is 0, which is accomplished precisely when we choose $y = x$. Thus, the solution to $Ax = b$ is the unique minimizer of the functional $J$.

APPM 4650 Chris Ketelsen December 16, 2013 Section 7.2
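The theorem can be sanity-checked numerically. The sketch below is only an illustration: the random matrix, the right-hand side, and the helper name `J` are assumptions made for the example and do not come from the notes. It checks the completed-square identity from Proof I and compares $J$ at the solution with $J$ at randomly perturbed points.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative SPD matrix (assumption): B^T B + I is symmetric positive definite.
B = rng.standard_normal((4, 4))
A = B.T @ B + np.eye(4)
b = rng.standard_normal(4)


def J(y):
    """Quadratic functional J(y) = 1/2 y^T A y - y^T b."""
    return 0.5 * y @ A @ y - y @ b


x = np.linalg.solve(A, b)  # solution of Ax = b

# Completed-square form from Proof I:
# J(y) = 1/2 (y - x)^T A (y - x) - 1/2 x^T A x
y = rng.standard_normal(4)
completed = 0.5 * (y - x) @ A @ (y - x) - 0.5 * x @ A @ x
identity_holds = bool(np.isclose(J(y), completed))

# J at the solution should be smaller than J at any other point.
trial_points = x + rng.standard_normal((100, 4))
is_minimum = all(J(x) < J(t) for t in trial_points)
print(identity_holds, is_minimum)
```

A check like this does not prove anything, but it is a cheap way to catch sign errors when implementing $J$.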
How does this help us come up with a new solver for $Ax = b$? It implies that instead of solving the system $Ax = b$ directly, we can solve the problem of minimizing $J$.

Proof II: Let $y = [y_1, y_2, \ldots, y_n]^T$ and recall that

$$\nabla J = \left[ \frac{\partial J}{\partial y_1}, \frac{\partial J}{\partial y_2}, \ldots, \frac{\partial J}{\partial y_n} \right]^T$$

Then, a routine calculation shows that

$$\nabla J = Ay - b$$

(The computation is left as an exercise.) Notice that $\nabla J$ evaluated at any vector $y$ is just the negative of the residual vector $r = b - Ay$ that arises from using $y$ as an approximate solution to $Ax = b$.

We know from calculus that the extreme values of scalar functions like $J$ occur where $\nabla J = 0$. Since $A$ is s.p.d. it is also nonsingular, which means the sole critical point of $J$ occurs when

$$Ay - b = 0 \quad \Longleftrightarrow \quad y = x, \text{ where } Ax = b$$

Since the functional $J$ is quadratic and the matrix in the leading term is s.p.d., we can conclude that the only extreme value of $J$ is a minimum. Thus we have again shown that the unique minimizer of the functional $J$ is the solution to the linear system $Ax = b$.

Descent Methods: Descent methods are iterative methods that attempt to descend to the minimum of the functional $J$ (i.e. the solution of $Ax = b$). Descent methods start with an initial guess $x^{(0)}$ and generate a sequence of iterates $x^{(1)}, x^{(2)}, x^{(3)}, \ldots$ such that each pair of consecutive iterates satisfies

$$J\left(x^{(k+1)}\right) \le J\left(x^{(k)}\right) \quad \text{but hopefully} \quad J\left(x^{(k+1)}\right) < J\left(x^{(k)}\right)$$

That is to say, each new iterate should lower the value of the functional and thus get closer to the solution of $Ax = b$. If at some point we have $Ax^{(k)} = b$, or nearly so, we stop and consider $x^{(k)}$ our approximate solution. If $x^{(k)}$ is still not a good enough solution to $Ax = b$, we take another step in the iteration, which decreases the functional.

The question is: How do we go from $x^{(k)}$ to $x^{(k+1)}$? The process has two components:

1. Choose a direction to move in. (We'll call this the search direction.)
2. Choose how far to move in the decided upon direction.
This can be stated more mathematically:

1. Choose a search direction vector $p^{(k)}$ to move in on the $k$-th iteration.
2. Choose a step parameter $\alpha_k \in \mathbb{R}$ determining how far to move in the direction of $p^{(k)}$.

This produces the update step

$$x^{(k+1)} = x^{(k)} + \alpha_k p^{(k)}$$

Notice that the new iterate $x^{(k+1)}$ lies along a line starting from $x^{(k)}$ and heading in the search direction $p^{(k)}$. The choice of $\alpha_k$ determines how far along this line we move. Intuitively, since we're solving a minimization problem, we'd like to choose $\alpha_k$ so that the move reduces $J$ as much as possible. When we move along the line in such a way as to reduce the functional, the movement is called a line search.

Given a direction $p^{(k)}$ for a line search there are a lot of choices for how far we actually move. If we choose $\alpha_k$ so that the move exactly minimizes the functional $J$ along that line, the process is called an exact line search. If, for whatever reason, we do not pick $\alpha_k$ to explicitly minimize $J$, the process is called an inexact line search.

Suppose that given $x^{(k)}$ we have (somehow) decided on a direction $p^{(k)}$ to search in. For an arbitrary functional, it is usually very difficult to find an $\alpha_k$ so that the movement to $x^{(k+1)}$ exactly minimizes the functional, but since in our case $J$ is a quadratic it's actually pretty easy.

Exact Line Search: For a given iterate $x^{(k)}$ and search direction $p^{(k)}$, define

$$g(\alpha) = J\left(x^{(k)} + \alpha p^{(k)}\right)$$

Then the minimizer of $g(\alpha)$ is $\alpha_k$, the correct value for our exact line search. To find the minimizer we set $g'(\alpha) = 0$ and solve for $\alpha$. Notice that, using the symmetry of $A$ and $r^{(k)} = b - Ax^{(k)}$,

$$
\begin{aligned}
g(\alpha) &= J\left(x^{(k)} + \alpha p^{(k)}\right) \\
          &= \tfrac{1}{2} \left(x^{(k)} + \alpha p^{(k)}\right)^T A \left(x^{(k)} + \alpha p^{(k)}\right) - \left(x^{(k)} + \alpha p^{(k)}\right)^T b \\
          &= \left( \tfrac{1}{2} x^{(k)T} A x^{(k)} - x^{(k)T} b \right) + \alpha\, p^{(k)T} A x^{(k)} - \alpha\, p^{(k)T} b + \tfrac{1}{2} \alpha^2\, p^{(k)T} A p^{(k)} \\
          &= J\left(x^{(k)}\right) - \alpha\, p^{(k)T} r^{(k)} + \tfrac{1}{2} \alpha^2\, p^{(k)T} A p^{(k)}
\end{aligned}
$$

Then, taking the derivative of $g$ with respect to $\alpha$, setting it equal to 0, and solving for $\alpha$ we have

$$g'(\alpha) = -p^{(k)T} r^{(k)} + \alpha\, p^{(k)T} A p^{(k)} = 0 \quad \Longrightarrow \quad \alpha = \frac{r^{(k)T} p^{(k)}}{p^{(k)T} A p^{(k)}} =: \alpha_k$$
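The exact line search formula is easy to try out. The following is a minimal sketch, assuming a small illustrative s.p.d. matrix and an arbitrary search direction; the function name `exact_line_search_step` and all numerical data are hypothetical choices made for the example. It computes $\alpha_k$ from the formula and checks that nearby step lengths give larger values of $g$.

```python
import numpy as np


def exact_line_search_step(A, b, x, p):
    """Return alpha = (r^T p) / (p^T A p), with r = b - A x."""
    r = b - A @ x
    return (r @ p) / (p @ (A @ p))


A = np.array([[4.0, 1.0], [1.0, 3.0]])  # illustrative SPD matrix (assumption)
b = np.array([1.0, 2.0])
x = np.zeros(2)                          # current iterate
p = np.array([1.0, -1.0])                # arbitrary search direction


def g(alpha):
    """J restricted to the line x + alpha * p."""
    y = x + alpha * p
    return 0.5 * y @ A @ y - y @ b


alpha_k = exact_line_search_step(A, b, x, p)

# g is an upward-opening parabola in alpha, so moving away from alpha_k
# in either direction should increase g.
nearby = [g(alpha_k + d) for d in (-0.1, -0.01, 0.01, 0.1)]
alpha_is_minimizer = all(g(alpha_k) < v for v in nearby)
```

Note that $\alpha_k$ may be negative when $p^{(k)}$ points "uphill", as in this example; the formula still returns the minimizer along the line.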
So, given a search direction $p^{(k)}$, the choice of $\alpha_k$ above results in a step that minimizes $J$ along $p^{(k)}$. Now we just need to choose the search direction. Notice that the only way to get $\alpha_k = 0$ is to choose a search direction $p^{(k)}$ that is orthogonal to $r^{(k)}$. Since this would result in no movement at all, we would like to avoid this.

Choice of Search Direction: There are many ways to choose the search direction. One instructive (but naive) method is to choose the direction in which the functional $J$ decreases most rapidly. Recall from vector calculus that a scalar function decreases most rapidly in the direction of its negative gradient. Therefore we propose choosing

$$p^{(k)} = -\nabla J\left(x^{(k)}\right) = b - Ax^{(k)} = r^{(k)}$$

This indicates that choosing the residual as the search direction gives the greatest local rate of decrease of the functional. This leads to the so-called Method of Steepest Descent:

    Function SteepestDescent()
        Input: matrix A, right-hand side vector b, initial guess x^(0)
        r ← b − Ax
        p ← r
        while not yet converged do
            α ← p^T r / p^T A p
            x ← x + αp
            r ← b − Ax
            p ← r

Notice that the most expensive operations in the algorithm are the two matrix-vector multiplies in each iteration, $Ap$ and $Ax$. We can reduce their cost as follows. Notice that

$$r^{(k+1)} = b - Ax^{(k+1)} = b - A\left(x^{(k)} + \alpha p^{(k)}\right) = r^{(k)} - \alpha A p^{(k)}$$

With this residual update, the only two mat-vecs in the loop are both of the form $Ap$. Since mat-vecs are (relatively) expensive, we might as well only do this operation once and store it for later use. The modified algorithm then becomes
    Function SteepestDescent()
        Input: matrix A, right-hand side vector b, initial guess x^(0)
        r ← b − Ax
        p ← r
        while not yet converged do
            q ← Ap
            α ← p^T r / p^T q
            x ← x + αp
            r ← r − αq
            p ← r

Geometric Interpretation of Steepest Descent

Recall that in the Steepest Descent Method, we're attempting to minimize the functional

$$J(y) = \tfrac{1}{2} (y - x)^T A (y - x) - \tfrac{1}{2} x^T A x$$

by performing, at each iteration, an exact line search in the direction of the residual vector (the negative gradient). Geometrically you can think of this as moving from your current iterate in the direction orthogonal to the contour through it, and continuing until the search line is tangent to another contour. The process continues until you are reasonably close to the minimum of the functional.

Notice that if the shape of the contours is close to circular then the Steepest Descent Method reaches the minimum fairly quickly. If, on the other hand, the contours are fairly elongated, it can take a large number of iterations to reach the minimum, and Steepest Descent will be very inefficient.
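The modified algorithm translates almost line for line into NumPy. The sketch below is one possible rendering, not a reference implementation: the stopping rule (relative residual below `tol`), the iteration cap `max_iter`, and the test matrix are assumptions added for the example.

```python
import numpy as np


def steepest_descent(A, b, x0, tol=1e-8, max_iter=10_000):
    """Steepest descent for SPD A with one mat-vec per iteration.

    Stops when ||r|| / ||b|| <= tol (an assumed stopping rule).
    Returns the approximate solution and the number of iterations taken.
    """
    x = np.array(x0, dtype=float)
    r = b - A @ x
    p = r.copy()
    b_norm = np.linalg.norm(b)
    k = 0
    while np.linalg.norm(r) > tol * b_norm and k < max_iter:
        q = A @ p                  # the only mat-vec inside the loop
        alpha = (p @ r) / (p @ q)  # exact line search along p
        x = x + alpha * p
        r = r - alpha * q          # residual update without a second mat-vec
        p = r.copy()               # next search direction is the residual
        k += 1
    return x, k


A = np.array([[4.0, 1.0], [1.0, 3.0]])  # illustrative SPD matrix (assumption)
b = np.array([1.0, 2.0])
x_sd, iters = steepest_descent(A, b, np.zeros(2))
```

Because the convergence test runs before the step length is computed, the loop never divides by $p^T q = 0$ once the residual has vanished.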
Let's analyze the case when convergence of Steepest Descent is very slow. We know that the functional $J$ that we're trying to minimize is a quadratic. To get a better idea of the contours of the functional, we transform $J$ into a new coordinate system. Since $A$ is symmetric it has the following eigen-decomposition:

$$A = U \Lambda U^T$$

where $\Lambda$ is a diagonal matrix made up of the eigenvalues of $A$ and $U$ is an orthogonal matrix whose columns are the eigenvectors of $A$. Substituting this into the functional (and dropping the constant term and scaling factor since they don't affect the minimization) we have

$$J(y) = (y - x)^T U \Lambda U^T (y - x) = \left[ U^T (y - x) \right]^T \Lambda \left[ U^T (y - x) \right]$$

We then define a new coordinate system by $z = U^T (y - x)$. Since $U$ is an orthogonal transformation, the contours of the functional are simply translated and rotated. We now have our simplified functional

$$J(z) = z^T \Lambda z = \sum_{i=1}^{n} \lambda_i z_i^2$$

Let's think about this in two dimensions because we can actually visualize it. For $n = 2$ we have

$$J(z) = z^T \Lambda z = \lambda_1 z_1^2 + \lambda_2 z_2^2$$

The level curves of the functional look like

$$\lambda_1 z_1^2 + \lambda_2 z_2^2 = c$$

which form a family of concentric ellipses, with the solution of $Ax = b$ located at the center. We can rewrite the equation of the elliptical contours as

$$\frac{\lambda_1}{c} z_1^2 + \frac{\lambda_2}{c} z_2^2 = 1$$

which, if we assume that $\lambda_2 > \lambda_1$, indicates that the major and minor semi-axes of the ellipse have lengths $\sqrt{c/\lambda_1}$ and $\sqrt{c/\lambda_2}$, respectively. We can get an idea of how elongated the ellipses are by taking the ratio of the lengths of the two axes. This ratio is given by

$$\frac{\sqrt{c/\lambda_1}}{\sqrt{c/\lambda_2}} = \sqrt{\lambda_2 / \lambda_1}$$
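The axis ratio can be computed directly from the eigenvalues. The sketch below uses an illustrative $2 \times 2$ s.p.d. matrix (an assumption, not data from the notes) and also compares the ratio with the square root of the 2-norm condition number, which for a symmetric positive definite matrix equals $\lambda_{\max} / \lambda_{\min}$.

```python
import numpy as np

A = np.array([[4.0, 1.0], [1.0, 3.0]])  # illustrative SPD matrix (assumption)

# eigh returns the eigenvalues in ascending order together with
# orthonormal eigenvectors (the columns of U).
lam, U = np.linalg.eigh(A)

c = 1.0                                  # any positive contour level
semi_axes = np.sqrt(c / lam)             # semi-axis length along each eigenvector
axis_ratio = semi_axes[0] / semi_axes[1]  # major / minor

# The ratio depends only on the eigenvalues, not on the level c.
sqrt_kappa = np.sqrt(np.linalg.cond(A, 2))
```

The longest axis belongs to the smallest eigenvalue, so the contours are stretched along the eigenvector in which $J$ is flattest.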
Notice that since $A$ is symmetric we have

$$\|A\|_2 = \rho(A) = \max_{\lambda} |\lambda| \quad \text{and} \quad \left\|A^{-1}\right\|_2 = \rho\left(A^{-1}\right) = 1 / \min_{\lambda} |\lambda|$$

which for this $2 \times 2$ case gives

$$\sqrt{\lambda_2 / \lambda_1} = \sqrt{\lambda_2 \cdot \frac{1}{\lambda_1}} = \sqrt{\|A\|_2 \left\|A^{-1}\right\|_2} = \sqrt{\kappa_2(A)}$$

This indicates that the contours of the functional $J$ will be very elongated if $A$ is ill-conditioned (i.e. has a large condition number) and thus Steepest Descent will take a long time to converge.

Example: Consider solving the linear system $Ax = b$ where

$$A = \begin{bmatrix} 25 & 1 \\ 1 & 1 \end{bmatrix} \qquad b = \begin{bmatrix} 30 \\ 6 \end{bmatrix} \qquad x = \begin{bmatrix} 1 \\ 5 \end{bmatrix}$$

The eigenvalues of $A$ are approximately $\lambda_1 = 0.96$ and $\lambda_2 = 25.04$, which gives a condition number of approximately $\kappa_2(A) = 26.1$. The matrix does not seem particularly ill-conditioned, but it means that the ratio of the axes of each contour ellipse is about $\sqrt{26.1} \approx 5.1$, so the contour ellipses are fairly elongated and we might expect Steepest Descent to converge slowly here. In fact, if we iterate until the relative residual $\|r\| / \|b\|$ is less than the chosen tolerance, it takes 43 iterations of Steepest Descent. 43 iterations for a $2 \times 2$ system! Can we do better than this?

Preconditioning Strategy: Transform the linear system into an equivalent problem that is better conditioned, and then apply Steepest Descent to the new problem.

We transform the linear system by applying a symmetric approximate inverse of $A$ to the linear system. There are many choices for this approximate inverse, but one common choice is to use the $M$ from our matrix splitting methods (e.g. Jacobi, Symmetric Gauss-Seidel, etc). Suppose we choose a symmetric matrix $M$ that is a good approximation to $A$. We'd then have

$$Ax = b \quad \Longrightarrow \quad M^{-1} A x = M^{-1} b \quad \Longrightarrow \quad \hat{A} x = \hat{b}$$

The general idea is that we'd now apply our Steepest Descent method to the modified linear system $\hat{A} x = \hat{b}$. Unfortunately, the way we've done it, $\hat{A}$ is no longer symmetric, even if $M$
is, so while we could probably apply it as a preconditioner for Steepest Descent, we wouldn't be able to use it with CG. Instead we consider the following:

Choose a symmetric positive definite $M$ and compute its Cholesky decomposition $M = R^T R$. Note that we have $M^{-1} = R^{-1} R^{-T}$. Multiplying through by $R^{-T}$ we have

$$Ax = b \quad \Longrightarrow \quad R^{-T} A x = R^{-T} b \quad \Longrightarrow \quad R^{-T} A R^{-1} R x = R^{-T} b \quad \Longrightarrow \quad \hat{A} \hat{x} = \hat{b}$$

where

$$\hat{A} = R^{-T} A R^{-1} \qquad \hat{x} = R x \qquad \hat{b} = R^{-T} b \qquad \hat{r} = \hat{b} - \hat{A} \hat{x} = R^{-T} r$$

Notice that $\hat{A}$ is symmetric (and positive definite) so we can apply Steepest Descent (or CG) to the system $\hat{A} \hat{x} = \hat{b}$. Then, once we've reached a good enough solution for $\hat{x}$, we get the solution to our original problem by $x = R^{-1} \hat{x}$.

Example: Consider the same example problem as before. One very popular (but not always that effective) choice of preconditioner is the matrix with just the main diagonal of $A$. This is called the Diagonal Preconditioner in most literature. (We will show later that this is equivalent to using one step of the Jacobi iteration with a zero initial guess.) So for this problem we'd take

$$M = \begin{bmatrix} 25 & 0 \\ 0 & 1 \end{bmatrix} = R^T R = \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 1 \end{bmatrix}$$

which gives

$$\hat{A} = \begin{bmatrix} 1/5 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 25 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1/5 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1/5 \\ 1/5 & 1 \end{bmatrix}$$

The condition number of $\hat{A}$ is $\kappa_2(\hat{A}) = 1.5$, which means that the ratio of the major and minor axes of the contour ellipses is $\sqrt{1.5} \approx 1.2$, which is much better than 5. In fact, if we apply Steepest Descent to the preconditioned problem, we converge to a relative residual under the same tolerance in just 15 iterations.
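The effect of the diagonal preconditioner on the condition number can be reproduced in a few lines. This sketch assumes the example matrix is $A = \begin{bmatrix} 25 & 1 \\ 1 & 1 \end{bmatrix}$, which is the symmetric matrix consistent with $b = (30, 6)^T$, $x = (1, 5)^T$ and the preconditioned matrix $\hat{A}$ of the example; the variable names are illustrative.

```python
import numpy as np

# Assumed example data: the symmetric matrix consistent with
# b = (30, 6)^T, x = (1, 5)^T and A_hat = [[1, 1/5], [1/5, 1]].
A = np.array([[25.0, 1.0], [1.0, 1.0]])

M = np.diag(np.diag(A))      # diagonal preconditioner
L = np.linalg.cholesky(M)    # NumPy returns lower-triangular L with M = L L^T
R = L.T                      # so R = L^T satisfies M = R^T R

R_inv = np.linalg.inv(R)     # explicit inverse is fine for a 2x2 illustration
A_hat = R_inv.T @ A @ R_inv  # A_hat = R^{-T} A R^{-1}

kappa_A = np.linalg.cond(A, 2)
kappa_A_hat = np.linalg.cond(A_hat, 2)
print(kappa_A, kappa_A_hat)
```

For larger problems one would not form $R^{-1}$ or $\hat{A}$ explicitly; they appear here only to make the transformation visible.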
Implementation

So far the preconditioning methodology that we've come up with requires us to do the Cholesky decomposition of $M$, transform the linear system, and then apply the Steepest Descent Method. This makes for a pretty large overhead. It would be nice if we could skip a lot of this stuff, and it turns out we can.

Consider applying Steepest Descent to the modified linear system $\hat{A} \hat{x} = \hat{b}$. The general algorithm looks like

    r̂ ← b̂ − Â x̂
    p̂ ← r̂
    while not yet converged do
        q̂ ← Â p̂
        α ← p̂^T r̂ / p̂^T q̂
        x̂ ← x̂ + α p̂
        r̂ ← r̂ − α q̂
        p̂ ← r̂

We know the explicit transformation for $x$, $b$, and $A$. We need to define them for the vectors $p$ and $q$ as well. Since $\hat{p}$ gets added to $\hat{x}$, it makes sense to define the transformation from $p$ to $\hat{p}$ in an analogous way to $x$:

$$\hat{x} = R x \quad \text{so we define} \quad \hat{p} = R p$$

Similarly, since $\hat{q}$ appears with $\hat{r}$:

$$\hat{r} = R^{-T} r \quad \text{so we define} \quad \hat{q} = R^{-T} q$$

Now, we can write the steps in the Steepest Descent algorithm involving the hat vectors in terms of their transformations of the original vectors, and we see that a lot of the transformations cancel out. For instance

$$\hat{x} \leftarrow \hat{x} + \alpha \hat{p} \quad \Longrightarrow \quad R x \leftarrow R x + \alpha R p \quad \Longrightarrow \quad x \leftarrow x + \alpha p$$

Similarly for the residual update we have

$$\hat{r} \leftarrow \hat{r} - \alpha \hat{q} \quad \Longrightarrow \quad R^{-T} r \leftarrow R^{-T} r - \alpha R^{-T} q \quad \Longrightarrow \quad r \leftarrow r - \alpha q$$
In fact, the transformation drops out for almost every step in the Steepest Descent Algorithm. For example, $\hat{q} = \hat{A} \hat{p} = R^{-T} A R^{-1} R p = R^{-T} A p$ gives $q = Ap$, and $\hat{p}^T \hat{r} = p^T R^T R^{-T} r = p^T r$ (likewise $\hat{p}^T \hat{q} = p^T q$), so $\alpha$ is unchanged. We have

    r ← b − Ax
    p ← ?
    while not yet converged do
        q ← Ap
        α ← p^T r / p^T q
        x ← x + αp
        r ← r − αq
        p ← ?

Notice, the only step in the algorithm that needs to be changed is the updating of the search direction. This happens because the transformations of $p$ and $r$ are not the same (i.e. one involves $R$ and the other $R^{-T}$). But, it's still pretty easy to figure out:

$$\hat{p} \leftarrow \hat{r} \quad \Longrightarrow \quad R p \leftarrow R^{-T} r \quad \Longrightarrow \quad p \leftarrow R^{-1} R^{-T} r = M^{-1} r$$

So the final algorithm becomes

    Function PreconditionedSteepestDescent()
        Input: matrix A, preconditioner M, right-hand side vector b, initial guess x^(0)
        r ← b − Ax
        p ← M^{-1} r
        while not yet converged do
            q ← Ap
            α ← p^T r / p^T q
            x ← x + αp
            r ← r − αq
            p ← M^{-1} r

So the only thing that changes in the preconditioned descent algorithm is that instead of choosing the residual as the search direction, we apply $M^{-1}$ to the residual and use that instead. Notice that this means that we never need the factorization $M = R^T R$ itself; we only need to be able to apply $M^{-1}$ to a vector (that is, solve $Mp = r$).
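The final algorithm can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than a definitive implementation: the preconditioner is passed as a hypothetical callable `apply_Minv` that returns $M^{-1} r$, the stopping rule and iteration cap are assumed, and the data is the $2 \times 2$ system with $A = \begin{bmatrix} 25 & 1 \\ 1 & 1 \end{bmatrix}$, $b = (30, 6)^T$ (the matrix consistent with the example's solution and $\hat{A}$).

```python
import numpy as np


def preconditioned_steepest_descent(A, b, x0, apply_Minv, tol=1e-8, max_iter=10_000):
    """Steepest descent with search direction p = M^{-1} r.

    apply_Minv is a callable returning M^{-1} r for a vector r.
    Stops when ||r|| / ||b|| <= tol (an assumed stopping rule).
    Returns the approximate solution and the number of iterations taken.
    """
    x = np.array(x0, dtype=float)
    r = b - A @ x
    p = apply_Minv(r)
    b_norm = np.linalg.norm(b)
    k = 0
    while np.linalg.norm(r) > tol * b_norm and k < max_iter:
        q = A @ p
        alpha = (p @ r) / (p @ q)
        x = x + alpha * p
        r = r - alpha * q
        p = apply_Minv(r)  # the only change from plain steepest descent
        k += 1
    return x, k


# Assumed example data (the 2x2 system with solution (1, 5)^T).
A = np.array([[25.0, 1.0], [1.0, 1.0]])
b = np.array([30.0, 6.0])
d = np.diag(A)  # diagonal of A, used as the preconditioner M = D

# The identity as preconditioner recovers plain steepest descent.
x_plain, it_plain = preconditioned_steepest_descent(A, b, np.zeros(2), lambda r: r)
x_diag, it_diag = preconditioned_steepest_descent(A, b, np.zeros(2), lambda r: r / d)
```

Passing the preconditioner as a function rather than a matrix reflects the point made above: only the action of $M^{-1}$ on a vector is needed. Iteration counts depend on the initial guess and tolerance, so the counts from this sketch need not match those quoted in the example.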
Examples of Preconditioners

Jacobi and the Diagonal Preconditioner: The preconditioning step in the algorithm occurs when we set the new search direction equal to $M^{-1}$ times the residual. In other words,

$$p \leftarrow M^{-1} r$$

If we're using the Diagonal Preconditioner this just means we use $p \leftarrow D^{-1} r$. We can also show that this is equivalent to applying one iteration of Jacobi to the system $Ap = r$ using a zero initial guess. Recall that, with the splitting $A = D - L - U$, the general form of an iteration of Jacobi applied to the linear system $Ap = r$ is given by

$$p^{(1)} \leftarrow D^{-1} (L + U) p^{(0)} + D^{-1} r$$

If we choose the zero initial guess $p^{(0)} = 0$ then this reduces to

$$p \leftarrow D^{-1} r$$

which is exactly the same as the Diagonal Preconditioner.

Symmetric Gauss-Seidel: In order to use Gauss-Seidel and keep the preconditioned system symmetric, we have to use a modified form of Gauss-Seidel known as Symmetric Gauss-Seidel. In order to derive the method we have to think about the efficient implementation of the standard Gauss-Seidel iteration. Recall that in terms of a matrix splitting, one Gauss-Seidel iteration applied to the linear system $Ap = r$ is given by

$$p^{(1)} \leftarrow (D - L)^{-1} U p^{(0)} + (D - L)^{-1} r$$

Those of you who chose to implement Gauss-Seidel for the homework using loops instead of matrices will recognize one iteration of Gauss-Seidel as

for $i = 1$ to $n$ do

$$p_i^{(k+1)} \leftarrow \frac{1}{a_{ii}} \left( r_i - \sum_{j=1}^{i-1} a_{ij} p_j^{(k+1)} - \sum_{j=i+1}^{n} a_{ij} p_j^{(k)} \right)$$

Now, the trick to symmetrizing Gauss-Seidel is to do two iterations of Gauss-Seidel, but loop over the variables in different orders. You do the first iteration as written above, and then for the second iteration you start at the bottom of the vector and loop backwards up to the top. In other words you do
for $i = n$ down to $1$ do

$$p_i^{(k+1)} \leftarrow \frac{1}{a_{ii}} \left( r_i - \sum_{j=1}^{i-1} a_{ij} p_j^{(k)} - \sum_{j=i+1}^{n} a_{ij} p_j^{(k+1)} \right)$$

(In the backward loop it is the entries with $j > i$ that have already been updated.) It is easy to check that Gauss-Seidel with a backwards loop can be written as a matrix splitting iteration, with the form

$$p^{(1)} \leftarrow (D - U)^{-1} L p^{(0)} + (D - U)^{-1} r$$

Because of the ordering of the loops, the traditional Gauss-Seidel method we discussed in class is sometimes called Forward Gauss-Seidel (FGS) and the iteration with the reverse loop is called Backward Gauss-Seidel (BGS). One iteration of Forward Gauss-Seidel, followed by an iteration of Backward Gauss-Seidel, is considered one iteration of Symmetric Gauss-Seidel (SGS).

Note: If you're purely interested in being able to implement a Symmetric Gauss-Seidel preconditioner, then you can stop at this point. The gist of the method is that you apply the preconditioner by running one iteration of SGS on $Ap = r$ with a zero initial guess. If, however, you want to see how this thing could possibly be symmetric, then continue reading. Be warned that it gets a little algebra-heavy.

Now, let's think about applying one iteration of Symmetric Gauss-Seidel with a zero initial guess for doing the preconditioning in steepest descent. In other words we will do one iteration of SGS applied to the linear system $Ap = r$ with $p^{(0)} = 0$.

One iteration of FGS:

$$p^{(1)} \leftarrow (D - L)^{-1} U p^{(0)} + (D - L)^{-1} r = (D - L)^{-1} r$$

One iteration of BGS:

$$
\begin{aligned}
p^{(2)} &\leftarrow (D - U)^{-1} L p^{(1)} + (D - U)^{-1} r \\
        &= (D - U)^{-1} L (D - L)^{-1} r + (D - U)^{-1} r \\
        &= \left[ (D - U)^{-1} L (D - L)^{-1} + (D - U)^{-1} \right] r
\end{aligned}
$$

So, we can think of doing one iteration of SGS on $Ap = r$ as

$$p \leftarrow \left[ (D - U)^{-1} L (D - L)^{-1} + (D - U)^{-1} \right] r = M^{-1} r$$

We can of course simplify this expression to make it a bit more manageable. Let's concentrate on just the matrix
$$
\begin{aligned}
M^{-1} &= \left[ (D - U)^{-1} L (D - L)^{-1} + (D - U)^{-1} \right] \\
       &= (D - U)^{-1} \left[ L (D - L)^{-1} + I \right] \\
       &= (D - U)^{-1} \left[ L (D - L)^{-1} + (D - L)(D - L)^{-1} \right] \\
       &= (D - U)^{-1} \left[ L + (D - L) \right] (D - L)^{-1} \\
       &= (D - U)^{-1} D (D - L)^{-1}
\end{aligned}
$$

So, one preconditioning step of SGS involves multiplying the residual by $M^{-1} = (D - U)^{-1} D (D - L)^{-1}$.

We now want to convince ourselves that the matrix $M$ is symmetric. To do this we note that

$$M = (D - L) D^{-1} (D - U)$$

Now, note that by construction, if the matrix $A$ is symmetric, then $U$ and $L$ are transposes of each other. In other words $U^T = L$ and $L^T = U$. Notice also that since $D$ is diagonal we have $D^T = D$ and $D^{-T} = D^{-1}$. We then have

$$
\begin{aligned}
M^T &= \left[ (D - L) D^{-1} (D - U) \right]^T \\
    &= (D - U)^T D^{-T} (D - L)^T \\
    &= \left( D^T - U^T \right) D^{-T} \left( D^T - L^T \right) \\
    &= (D - L) D^{-1} (D - U) = M
\end{aligned}
$$

so $M$ is symmetric.
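The equivalence between the two sweeps and the matrix $M = (D - L) D^{-1} (D - U)$ can be checked numerically. The sketch below is illustrative: the function name `sgs_apply` and the random test matrix are assumptions. It relies on the splitting $A = D - L - U$, under which $D - L$ is the lower triangle of $A$ (including the diagonal) and $D - U$ is the upper triangle.

```python
import numpy as np


def sgs_apply(A, r):
    """One SGS iteration on A p = r starting from p = 0.

    A forward Gauss-Seidel sweep is followed by a backward sweep.
    Updating p in place means each sweep automatically uses the newest
    available entries.
    """
    n = len(r)
    p = np.zeros(n)
    for i in range(n):            # forward sweep: i = 1, ..., n
        p[i] = (r[i] - A[i, :i] @ p[:i] - A[i, i + 1:] @ p[i + 1:]) / A[i, i]
    for i in reversed(range(n)):  # backward sweep: i = n, ..., 1
        p[i] = (r[i] - A[i, :i] @ p[:i] - A[i, i + 1:] @ p[i + 1:]) / A[i, i]
    return p


rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5))
A = B.T @ B + 5.0 * np.eye(5)     # illustrative SPD matrix (assumption)
r = rng.standard_normal(5)

# With A = D - L - U: D - L = tril(A) and D - U = triu(A).
D = np.diag(np.diag(A))
M = np.tril(A) @ np.linalg.inv(D) @ np.triu(A)  # M = (D - L) D^{-1} (D - U)

p_sweeps = sgs_apply(A, r)        # preconditioner applied via two sweeps
p_matrix = np.linalg.solve(M, r)  # preconditioner applied via the matrix M
```

In practice only the sweep version would be used; forming $M$ explicitly is done here purely to confirm the algebra and the symmetry of $M$.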
Numerical Optimization Prof. Shirish K. Shevade Department of Computer Science and Automation Indian Institute of Science, Bangalore Lecture - 13 Steepest Descent Method Hello, welcome back to this series
More informationChapter 2. Solving Systems of Equations. 2.1 Gaussian elimination
Chapter 2 Solving Systems of Equations A large number of real life applications which are resolved through mathematical modeling will end up taking the form of the following very simple looking matrix
More informationM.A. Botchev. September 5, 2014
Rome-Moscow school of Matrix Methods and Applied Linear Algebra 2014 A short introduction to Krylov subspaces for linear systems, matrix functions and inexact Newton methods. Plan and exercises. M.A. Botchev
More informationBindel, Fall 2011 Intro to Scientific Computing (CS 3220) Week 3: Wednesday, Jan 9
Problem du jour Week 3: Wednesday, Jan 9 1. As a function of matrix dimension, what is the asymptotic complexity of computing a determinant using the Laplace expansion (cofactor expansion) that you probably
More information6.4 Krylov Subspaces and Conjugate Gradients
6.4 Krylov Subspaces and Conjugate Gradients Our original equation is Ax = b. The preconditioned equation is P Ax = P b. When we write P, we never intend that an inverse will be explicitly computed. P
More information5.2 Infinite Series Brian E. Veitch
5. Infinite Series Since many quantities show up that cannot be computed exactly, we need some way of representing it (or approximating it). One way is to sum an infinite series. Recall that a n is the
More informationPreconditioning Techniques Analysis for CG Method
Preconditioning Techniques Analysis for CG Method Huaguang Song Department of Computer Science University of California, Davis hso@ucdavis.edu Abstract Matrix computation issue for solve linear system
More informationLecture 6. Regularized least-squares and minimum-norm methods 6 1
Regularized least-squares and minimum-norm methods 6 1 Lecture 6 Regularized least-squares and minimum-norm methods EE263 Autumn 2004 multi-objective least-squares regularized least-squares nonlinear least-squares
More informationLinear Solvers. Andrew Hazel
Linear Solvers Andrew Hazel Introduction Thus far we have talked about the formulation and discretisation of physical problems...... and stopped when we got to a discrete linear system of equations. Introduction
More informationPhysics 403. Segev BenZvi. Numerical Methods, Maximum Likelihood, and Least Squares. Department of Physics and Astronomy University of Rochester
Physics 403 Numerical Methods, Maximum Likelihood, and Least Squares Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Quadratic Approximation
More informationLecture 11: CMSC 878R/AMSC698R. Iterative Methods An introduction. Outline. Inverse, LU decomposition, Cholesky, SVD, etc.
Lecture 11: CMSC 878R/AMSC698R Iterative Methods An introduction Outline Direct Solution of Linear Systems Inverse, LU decomposition, Cholesky, SVD, etc. Iterative methods for linear systems Why? Matrix
More informationUnconstrained optimization
Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout
More informationCHAPTER 6. Projection Methods. Let A R n n. Solve Ax = f. Find an approximate solution ˆx K such that r = f Aˆx L.
Projection Methods CHAPTER 6 Let A R n n. Solve Ax = f. Find an approximate solution ˆx K such that r = f Aˆx L. V (n m) = [v, v 2,..., v m ] basis of K W (n m) = [w, w 2,..., w m ] basis of L Let x 0
More informationEECS 275 Matrix Computation
EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 20 1 / 20 Overview
More informationParallel Numerics, WT 2016/ Iterative Methods for Sparse Linear Systems of Equations. page 1 of 1
Parallel Numerics, WT 2016/2017 5 Iterative Methods for Sparse Linear Systems of Equations page 1 of 1 Contents 1 Introduction 1.1 Computer Science Aspects 1.2 Numerical Problems 1.3 Graphs 1.4 Loop Manipulations
More informationOrthogonality. 6.1 Orthogonal Vectors and Subspaces. Chapter 6
Chapter 6 Orthogonality 6.1 Orthogonal Vectors and Subspaces Recall that if nonzero vectors x, y R n are linearly independent then the subspace of all vectors αx + βy, α, β R (the space spanned by x and
More informationInterior Point Methods. We ll discuss linear programming first, followed by three nonlinear problems. Algorithms for Linear Programming Problems
AMSC 607 / CMSC 764 Advanced Numerical Optimization Fall 2008 UNIT 3: Constrained Optimization PART 4: Introduction to Interior Point Methods Dianne P. O Leary c 2008 Interior Point Methods We ll discuss
More information1 Extrapolation: A Hint of Things to Come
Notes for 2017-03-24 1 Extrapolation: A Hint of Things to Come Stationary iterations are simple. Methods like Jacobi or Gauss-Seidel are easy to program, and it s (relatively) easy to analyze their convergence.
More informationQuasi-Newton Methods
Newton s Method Pros and Cons Quasi-Newton Methods MA 348 Kurt Bryan Newton s method has some very nice properties: It s extremely fast, at least once it gets near the minimum, and with the simple modifications
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 18 Outline
More informationPowerPoints organized by Dr. Michael R. Gustafson II, Duke University
Part 3 Chapter 10 LU Factorization PowerPoints organized by Dr. Michael R. Gustafson II, Duke University All images copyright The McGraw-Hill Companies, Inc. Permission required for reproduction or display.
More informationLinear Least-Squares Data Fitting
CHAPTER 6 Linear Least-Squares Data Fitting 61 Introduction Recall that in chapter 3 we were discussing linear systems of equations, written in shorthand in the form Ax = b In chapter 3, we just considered
More informationLecture 18 Classical Iterative Methods
Lecture 18 Classical Iterative Methods MIT 18.335J / 6.337J Introduction to Numerical Methods Per-Olof Persson November 14, 2006 1 Iterative Methods for Linear Systems Direct methods for solving Ax = b,
More informationNumerical Methods - Numerical Linear Algebra
Numerical Methods - Numerical Linear Algebra Y. K. Goh Universiti Tunku Abdul Rahman 2013 Y. K. Goh (UTAR) Numerical Methods - Numerical Linear Algebra I 2013 1 / 62 Outline 1 Motivation 2 Solving Linear
More informationNumerical Linear Algebra
Numerical Linear Algebra The two principal problems in linear algebra are: Linear system Given an n n matrix A and an n-vector b, determine x IR n such that A x = b Eigenvalue problem Given an n n matrix
More informationName: INSERT YOUR NAME HERE. Due to dropbox by 6pm PDT, Wednesday, December 14, 2011
AMath 584 Name: INSERT YOUR NAME HERE Take-home Final UWNetID: INSERT YOUR NETID Due to dropbox by 6pm PDT, Wednesday, December 14, 2011 The main part of the assignment (Problems 1 3) is worth 80 points.
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra)
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 24: Preconditioning and Multigrid Solver Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 5 Preconditioning Motivation:
More informationLecture Note 7: Iterative methods for solving linear systems. Xiaoqun Zhang Shanghai Jiao Tong University
Lecture Note 7: Iterative methods for solving linear systems Xiaoqun Zhang Shanghai Jiao Tong University Last updated: December 24, 2014 1.1 Review on linear algebra Norms of vectors and matrices vector
More informationIterative Methods. Splitting Methods
Iterative Methods Splitting Methods 1 Direct Methods Solving Ax = b using direct methods. Gaussian elimination (using LU decomposition) Variants of LU, including Crout and Doolittle Other decomposition
More informationCSC321 Lecture 5 Learning in a Single Neuron
CSC321 Lecture 5 Learning in a Single Neuron Roger Grosse and Nitish Srivastava January 21, 2015 Roger Grosse and Nitish Srivastava CSC321 Lecture 5 Learning in a Single Neuron January 21, 2015 1 / 14
More informationIterative Linear Solvers
Chapter 10 Iterative Linear Solvers In the previous two chapters, we developed strategies for solving a new class of problems involving minimizing a function f ( x) with or without constraints on x. In
More informationConjugate Gradients: Idea
Overview Steepest Descent often takes steps in the same direction as earlier steps Wouldn t it be better every time we take a step to get it exactly right the first time? Again, in general we choose a
More informationProgramming, numerics and optimization
Programming, numerics and optimization Lecture C-3: Unconstrained optimization II Łukasz Jankowski ljank@ippt.pan.pl Institute of Fundamental Technological Research Room 4.32, Phone +22.8261281 ext. 428
More informationComputational Linear Algebra
Computational Linear Algebra PD Dr. rer. nat. habil. Ralf Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2017/18 Part 2: Direct Methods PD Dr.
More informationNumerical Linear Algebra Primer. Ryan Tibshirani Convex Optimization /36-725
Numerical Linear Algebra Primer Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: proximal gradient descent Consider the problem min g(x) + h(x) with g, h convex, g differentiable, and h simple
More informationEigenvectors and Hermitian Operators
7 71 Eigenvalues and Eigenvectors Basic Definitions Let L be a linear operator on some given vector space V A scalar λ and a nonzero vector v are referred to, respectively, as an eigenvalue and corresponding
More informationIntroduction to Iterative Solvers of Linear Systems
Introduction to Iterative Solvers of Linear Systems SFB Training Event January 2012 Prof. Dr. Andreas Frommer Typeset by Lukas Krämer, Simon-Wolfgang Mages and Rudolf Rödl 1 Classes of Matrices and their
More information1 Error analysis for linear systems
Notes for 2016-09-16 1 Error analysis for linear systems We now discuss the sensitivity of linear systems to perturbations. This is relevant for two reasons: 1. Our standard recipe for getting an error
More informationComputational Linear Algebra
Computational Linear Algebra PD Dr. rer. nat. habil. Ralf Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2017/18 Part 3: Iterative Methods PD
More informationNonlinear Optimization
Nonlinear Optimization (Com S 477/577 Notes) Yan-Bin Jia Nov 7, 2017 1 Introduction Given a single function f that depends on one or more independent variable, we want to find the values of those variables
More informationEfficient Estimation of the A-norm of the Error in the Preconditioned Conjugate Gradient Method
Efficient Estimation of the A-norm of the Error in the Preconditioned Conjugate Gradient Method Zdeněk Strakoš and Petr Tichý Institute of Computer Science AS CR, Technical University of Berlin. International
More informationStat 206: Linear algebra
Stat 206: Linear algebra James Johndrow (adapted from Iain Johnstone s notes) 2016-11-02 Vectors We have already been working with vectors, but let s review a few more concepts. The inner product of two
More informationMath 1080: Numerical Linear Algebra Chapter 4, Iterative Methods
Math 1080: Numerical Linear Algebra Chapter 4, Iterative Methods M. M. Sussman sussmanm@math.pitt.edu Office Hours: MW 1:45PM-2:45PM, Thack 622 March 2015 1 / 70 Topics Introduction to Iterative Methods
More informationConjugate Gradient Method
Conjugate Gradient Method direct and indirect methods positive definite linear systems Krylov sequence spectral analysis of Krylov sequence preconditioning Prof. S. Boyd, EE364b, Stanford University Three
More informationLine Search Methods for Unconstrained Optimisation
Line Search Methods for Unconstrained Optimisation Lecture 8, Numerical Linear Algebra and Optimisation Oxford University Computing Laboratory, MT 2007 Dr Raphael Hauser (hauser@comlab.ox.ac.uk) The Generic
More information