
AMS526: Numerical Analysis I (Numerical Linear Algebra)
Lecture 21: Sensitivity of Eigenvalues and Eigenvectors; Conjugate Gradient Method
Xiangmin Jiao, Stony Brook University

Outline
1. Sensitivity of Eigenvalues
2. Conjugate Gradient Method

Residual of Eigenvalue Problems

We consider how the eigenvalues and eigenvectors of $A + \delta A$ differ from those of $A$. A small residual $r = Ax - \lambda x$ guarantees a small perturbation to $A$ in the backward-error sense.

Theorem. Let $A \in \mathbb{C}^{n\times n}$, let $x$ be an approximate eigenvector of $A$ with $\|x\|_2 = 1$, let $\lambda$ be an associated approximate eigenvalue, and let $r = Ax - \lambda x$ be the residual. Then $\lambda$ and $x$ are an exact eigenpair of some perturbed matrix $A + \delta A$, where $\|\delta A\|_2 = \|r\|_2$.

Proof. Let $\delta A = -r x^*$, which satisfies $\|\delta A\|_2 = \|r\|_2 \|x\|_2 = \|r\|_2$. Then $(A + \delta A)x = Ax - r x^* x = Ax - r = \lambda x$.
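The theorem is easy to verify numerically. The following minimal sketch (ours, not part of the slides) builds a crude approximate eigenpair from a Rayleigh quotient, forms $\delta A = -r x^*$, and checks both conclusions; the random test matrix, seed, and vector names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))

# Crude approximate eigenpair: one multiplication by A, then a Rayleigh quotient
x = rng.standard_normal(n)
x = A @ x
x /= np.linalg.norm(x)           # ||x||_2 = 1
lam = x @ A @ x                  # Rayleigh quotient as approximate eigenvalue

r = A @ x - lam * x              # residual r = Ax - lambda*x
dA = -np.outer(r, x)             # backward perturbation dA = -r x^T (real case)

print(np.linalg.norm(dA, 2), np.linalg.norm(r))     # equal: ||dA||_2 = ||r||_2
print(np.linalg.norm((A + dA) @ x - lam * x))       # ~0: (lambda, x) exact for A + dA
```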

Sensitivity of Eigenvalues

The condition number of the eigenvector matrix $X$ determines the sensitivity of the eigenvalues.

Theorem. Let $A \in \mathbb{C}^{n\times n}$ be a nondefective (semisimple) matrix, and suppose $A = X \Lambda X^{-1}$, where $X$ is nonsingular and $\Lambda$ is diagonal. Let $\delta A \in \mathbb{C}^{n\times n}$ be some perturbation of $A$, and let $\mu$ be an eigenvalue of $A + \delta A$. Then $A$ has an eigenvalue $\lambda$ such that
$$|\mu - \lambda| \le \kappa_p(X)\,\|\delta A\|_p \quad \text{for } 1 \le p \le \infty.$$

$\kappa_p(X)$ measures how far the eigenvectors are from being linearly dependent. For normal matrices, $X$ can be chosen unitary, so $\kappa_2(X) = 1$. For defective matrices, the condition number $\kappa_p(X)$ is taken to be infinite.
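For $p = 2$ this bound (the classical Bauer-Fike result) can be checked directly in a few lines of NumPy. The sketch below is ours, not from the slides; the random diagonalizable test matrix and the size of the perturbation are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 6
X = rng.standard_normal((n, n))
lam = rng.standard_normal(n)                 # prescribed eigenvalues of A
A = X @ np.diag(lam) @ np.linalg.inv(X)      # nondefective by construction

dA = 1e-6 * rng.standard_normal((n, n))      # small perturbation
mu = np.linalg.eigvals(A + dA)               # eigenvalues of A + dA

# Distance from each perturbed eigenvalue to the nearest eigenvalue of A
dist = np.min(np.abs(mu[:, None] - lam[None, :]), axis=1)
bound = np.linalg.cond(X, 2) * np.linalg.norm(dA, 2)   # kappa_2(X) * ||dA||_2
print(dist.max(), bound)                                # dist.max() <= bound
```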

Sensitivity of Eigenvalues

Proof. Let $\delta\Lambda = X^{-1}(\delta A)\,X$. Then
$$\|\delta\Lambda\|_p \le \|X^{-1}\|_p\,\|\delta A\|_p\,\|X\|_p = \kappa_p(X)\,\|\delta A\|_p.$$
Let $y$ be an eigenvector of $\Lambda + \delta\Lambda$ associated with $\mu$. Suppose $\mu$ is not an eigenvalue of $A$ (otherwise the result is trivial), so $\mu I - \Lambda$ is nonsingular, and
$$(\Lambda + \delta\Lambda)y = \mu y \;\Rightarrow\; (\mu I - \Lambda)y = (\delta\Lambda)\,y \;\Rightarrow\; y = (\mu I - \Lambda)^{-1} (\delta\Lambda)\,y,$$
thus $1 \le \|(\mu I - \Lambda)^{-1}\|_p\,\|\delta\Lambda\|_p$. Moreover, $\|(\mu I - \Lambda)^{-1}\|_p = |\mu - \lambda|^{-1}$, where $\lambda$ is the eigenvalue of $A$ closest to $\mu$. Thus $|\mu - \lambda| \le \|\delta\Lambda\|_p \le \kappa_p(X)\,\|\delta A\|_p$.

Left and Right Eigenvectors

We would like to analyze the sensitivity of individual eigenvalues. First, define a left eigenvector $y$ associated with an eigenvalue $\lambda$ of a matrix $A$ to be a nonzero vector $y \in \mathbb{C}^n$ such that $y^* A = \lambda y^*$. Equivalently, left eigenvectors of $A$ are right eigenvectors of $A^*$.

Theorem. Let $A \in \mathbb{C}^{n\times n}$ have distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ with associated linearly independent right eigenvectors $x_1, \ldots, x_n$ and left eigenvectors $y_1, \ldots, y_n$. Then $y_j^* x_i \neq 0$ if $i = j$ and $y_j^* x_i = 0$ if $i \neq j$.

Proof. If $i \neq j$, then $y_j^* A x_i = \lambda_i\, y_j^* x_i$ and $y_j^* A x_i = \lambda_j\, y_j^* x_i$. Since $\lambda_i \neq \lambda_j$, we must have $y_j^* x_i = 0$. If $i = j$: since $\{x_i\}$ form a basis for $\mathbb{C}^n$, $y_i^* x_i = 0$ together with $y_i^* x_j = 0$ for $j \neq i$ would imply $y_i = 0$, a contradiction.

Sensitivity of Individual Eigenvalues

We analyze the sensitivity of individual eigenvalues that are distinct.

Theorem. Let $A \in \mathbb{C}^{n\times n}$ have $n$ distinct eigenvalues. Let $\lambda$ be an eigenvalue with associated right and left eigenvectors $x$ and $y$, respectively, normalized so that $\|x\|_2 = \|y\|_2 = 1$. Let $\delta A$ be a small perturbation satisfying $\|\delta A\|_2 = \epsilon$, and let $\lambda + \delta\lambda$ be the eigenvalue of $A + \delta A$ that approximates $\lambda$. Then
$$|\delta\lambda| \le \frac{1}{|y^* x|}\,\epsilon + O(\epsilon^2).$$

Let $s = y^* x$. Then $\kappa = 1/|s| = 1/|y^* x|$ is the condition number of the eigenvalue $\lambda$. A simple eigenvalue is sensitive if its associated right and left eigenvectors are nearly orthogonal.
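The condition number $1/|y^* x|$ is straightforward to compute with standard tools. The sketch below (ours; the random test matrix is arbitrary) obtains left eigenvectors as right eigenvectors of $A^*$, matches them to the corresponding eigenvalues, and prints $\kappa$ for each. (SciPy's `scipy.linalg.eig(A, left=True)` returns left eigenvectors directly.)

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))

lam, X = np.linalg.eig(A)           # right eigenvectors (columns of X)
mu, Y = np.linalg.eig(A.conj().T)   # right eigenvectors of A^* = left eigenvectors of A

for i in range(n):
    # lam[i] corresponds to the eigenvalue conj(lam[i]) of A^*
    j = int(np.argmin(np.abs(mu.conj() - lam[i])))
    x = X[:, i] / np.linalg.norm(X[:, i])
    y = Y[:, j] / np.linalg.norm(Y[:, j])
    kappa = 1.0 / abs(np.vdot(y, x))    # kappa = 1 / |y^* x|
    print(lam[i], kappa)
```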

Sensitivity of Individual Eigenvalues

Proof. We know that $|\delta\lambda| \le \kappa_p(X)\,\epsilon = O(\epsilon)$. In addition, $\|\delta x\| = O(\epsilon)$ when $\lambda$ is a simple eigenvalue. Because $(A + \delta A)(x + \delta x) = (\lambda + \delta\lambda)(x + \delta x)$, we have
$$(\delta A)\,x + A\,(\delta x) + O(\epsilon^2) = (\delta\lambda)\,x + \lambda\,(\delta x) + O(\epsilon^2).$$
Left-multiplying by $y^*$ and using $y^* A = \lambda y^*$, we obtain
$$y^* (\delta A)\,x + O(\epsilon^2) = (\delta\lambda)\, y^* x + O(\epsilon^2),$$
and hence
$$\delta\lambda = \frac{y^* (\delta A)\,x}{y^* x} + O(\epsilon^2).$$
Note that $|y^* (\delta A)\,x| \le \|y\|_2\,\|\delta A\|_2\,\|x\|_2 = \epsilon$, so
$$|\delta\lambda| \le \frac{1}{|y^* x|}\,\epsilon + O(\epsilon^2).$$

Sensitivity of Multiple Eigenvalues and Eigenvectors

Sensitivity of multiple eigenvalues is very complex. For a multiple eigenvalue, the left and right eigenvectors can be orthogonal, so the eigenvalue can be arbitrarily ill-conditioned. In general, multiple or close eigenvalues can be poorly conditioned, especially if the matrix is defective.

Condition numbers of eigenvectors are also difficult to analyze. If the matrix has well-conditioned and well-separated eigenvalues, then the eigenvectors are well-conditioned. If the eigenvalues are ill-conditioned or closely clustered, then the eigenvectors may be poorly conditioned.

Outline
1. Sensitivity of Eigenvalues
2. Conjugate Gradient Method

Direct vs. Iterative Methods

Direct (noniterative) methods compute the exact solution after a finite number of steps (in exact arithmetic). Examples: Gaussian elimination, QR factorization.

Iterative methods produce a sequence of approximations $x^{(1)}, x^{(2)}, \ldots$ that hopefully converge to the true solution. Examples: Jacobi, Conjugate Gradient (CG), GMRES, BiCG, etc. Caution: the boundary between direct and iterative methods is sometimes blurred.

Why use iterative methods instead of direct methods? They
- may be faster than direct methods,
- produce useful intermediate results,
- handle sparse matrices more easily (only matrix-vector products are needed),
- are often easier to implement on parallel computers.

Question: when should iterative methods not be used?

Two Classes of Iterative Methods

Stationary iterative methods find a splitting $A = M - K$ and iterate
$$x^{(k+1)} = M^{-1}\left(K x^{(k)} + b\right).$$
Examples: Jacobi (for linear systems, not the Jacobi iteration for eigenvalues), Gauss-Seidel, Successive Over-Relaxation (SOR), etc. A minimal Jacobi sketch follows this slide.

Krylov subspace methods find an (approximately) optimal solution in the Krylov subspace $\operatorname{span}\{b, Ab, A^2 b, \ldots, A^k b\}$, building the subspace successively. Examples: Conjugate Gradient (CG), Generalized Minimal Residual (GMRES), BiCG, etc.

We will focus on Krylov subspace methods.
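As a concrete illustration of a stationary splitting (ours, not from the slides), here is a minimal Jacobi iteration in Python/NumPy with $M = \operatorname{diag}(A)$ and $K = M - A$; the test system, tolerance, and function name are our own choices. Gauss-Seidel and SOR differ only in the choice of $M$.

```python
import numpy as np

def jacobi(A, b, x0=None, tol=1e-10, maxit=1000):
    """Jacobi iteration x_{k+1} = M^{-1}(K x_k + b); converges e.g. for
    strictly diagonally dominant A."""
    n = len(b)
    x = np.zeros(n) if x0 is None else x0.copy()
    d = np.diag(A)                    # M = diag(A), stored as a vector
    K = np.diag(d) - A                # K = M - A, so A = M - K
    for _ in range(maxit):
        x_new = (K @ x + b) / d       # apply M^{-1} by elementwise division
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# Example on a small, diagonally dominant system
A = np.array([[4.0, 1.0, 0.0], [1.0, 4.0, 1.0], [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
print(jacobi(A, b), np.linalg.solve(A, b))
```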

Krylov Subspace Algorithms

For $Ax = b$, create a sequence of Krylov subspaces
$$\mathcal{K}_n = \operatorname{span}\{b, Ab, \ldots, A^{n-1} b\},$$
and find approximate (hopefully optimal) solutions $x_n \in \mathcal{K}_n$. Only matrix-vector products are involved.

For SPD matrices, the most famous algorithm is the Conjugate Gradient (CG) method, discovered by Hestenes and Stiefel in 1952. It finds the best solution $x_n \in \mathcal{K}_n$ measured in the $A$-norm $\|x\|_A = \sqrt{x^T A x}$, and it only requires storing four vectors (instead of $n$ vectors) thanks to a three-term recurrence.

Motivation of Conjugate Gradients

If $A$ is $m \times m$ SPD, then the quadratic function
$$\varphi(x) = \tfrac{1}{2} x^T A x - x^T b$$
has a unique minimum. The negative gradient of this function is the residual vector,
$$-\nabla \varphi(x) = b - Ax = r,$$
so the minimum is attained precisely when $Ax = b$.

Optimization methods for $\varphi$ have the form $x_{n+1} = x_n + \alpha p_n$, where $p_n$ is a search direction and $\alpha$ is a step length chosen to minimize $\varphi(x_n + \alpha p_n)$. The line-search parameter can be determined analytically as $\alpha = r_n^T p_n / (p_n^T A p_n)$.

In CG, $p_n$ is chosen to be $A$-conjugate (or $A$-orthogonal) to the previous search directions, i.e., $p_n^T A p_j = 0$ for $j < n$. The simplest method of this form, steepest descent, is sketched below for comparison.
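Before CG itself, it may help to see steepest descent with the exact line search $\alpha = r^T p / (p^T A p)$, which always takes $p = r$. This sketch is ours (not from the slides), and the 2x2 SPD example is arbitrary.

```python
import numpy as np

def steepest_descent(A, b, tol=1e-10, maxit=10000):
    x = np.zeros_like(b)
    r = b - A @ x                      # r = -grad(phi) = b - Ax
    for _ in range(maxit):
        if np.linalg.norm(r) < tol:
            break
        p = r                          # search direction = negative gradient
        Ap = A @ p
        alpha = (r @ p) / (p @ Ap)     # exact minimizer of phi along p
        x = x + alpha * p
        r = b - A @ x
    return x

A = np.array([[3.0, 1.0], [1.0, 2.0]])   # SPD example
b = np.array([1.0, 1.0])
print(steepest_descent(A, b), np.linalg.solve(A, b))
```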

Conjugate Gradient Method

Algorithm: Conjugate Gradient Method
  x_0 = 0, r_0 = b, p_0 = r_0
  for n = 1, 2, 3, ...
      α_n = (r_{n-1}^T r_{n-1}) / (p_{n-1}^T A p_{n-1})    [step length]
      x_n = x_{n-1} + α_n p_{n-1}                          [approximate solution]
      r_n = r_{n-1} - α_n A p_{n-1}                        [residual]
      β_n = (r_n^T r_n) / (r_{n-1}^T r_{n-1})              [improvement this step]
      p_n = r_n + β_n p_{n-1}                              [search direction]

Only one matrix-vector product $A p_{n-1}$ is needed per iteration. Apart from the matrix-vector product, the number of operations per iteration is $O(m)$; if $A$ is sparse with a constant number of nonzeros per row, the total cost is $O(m)$ operations per iteration. CG can be viewed as minimizing the quadratic function $\varphi(x) = \tfrac{1}{2} x^T A x - x^T b$ by modifying steepest descent. A runnable version of the algorithm is sketched below.
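For concreteness, here is a direct transcription of the algorithm above into Python/NumPy. It is a sketch only: a production code would add a relative stopping criterion, preconditioning, and a matrix-free operator interface; the test system is our own example.

```python
import numpy as np

def conjugate_gradient(A, b, tol=1e-12, maxit=None):
    m = len(b)
    maxit = m if maxit is None else maxit
    x = np.zeros(m)                 # x_0 = 0
    r = b.copy()                    # r_0 = b
    p = r.copy()                    # p_0 = r_0
    rTr = r @ r
    for _ in range(maxit):
        Ap = A @ p                  # the only matrix-vector product per iteration
        alpha = rTr / (p @ Ap)      # step length
        x = x + alpha * p           # approximate solution
        r = r - alpha * Ap          # residual
        rTr_new = r @ r
        if np.sqrt(rTr_new) < tol:
            break
        beta = rTr_new / rTr        # improvement this step
        p = r + beta * p            # new search direction
        rTr = rTr_new
    return x

# Example on a small SPD system
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b), np.linalg.solve(A, b))
```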

An Alternative Interpretation of CG

Algorithm: CG (standard)
  x_0 = 0, r_0 = b, p_0 = r_0
  for n = 1, 2, 3, ...
      α_n = (r_{n-1}^T r_{n-1}) / (p_{n-1}^T A p_{n-1})
      x_n = x_{n-1} + α_n p_{n-1}
      r_n = r_{n-1} - α_n A p_{n-1}
      β_n = (r_n^T r_n) / (r_{n-1}^T r_{n-1})
      p_n = r_n + β_n p_{n-1}

Algorithm: A non-standard CG
  x_0 = 0, r_0 = b, p_0 = r_0
  for n = 1, 2, 3, ...
      α_n = (r_{n-1}^T p_{n-1}) / (p_{n-1}^T A p_{n-1})
      x_n = x_{n-1} + α_n p_{n-1}
      r_n = b - A x_n
      β_n = -(r_n^T A p_{n-1}) / (p_{n-1}^T A p_{n-1})
      p_n = r_n + β_n p_{n-1}

The non-standard variant is less efficient but easier to understand. It is easy to see that $r_n = r_{n-1} - \alpha_n A p_{n-1} = b - A x_n$. We need to show:
- α_n minimizes φ along the search direction p_{n-1};
- α_n and β_n are equivalent to those in the standard CG;
- minimizing φ along p_{n-1} also minimizes φ within the Krylov subspace.
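The non-standard variant can be written out literally and compared against the standard one on a small SPD system. The sketch below is ours, not from the slides; note the sign in $\beta_n$, which enforces $p_n^T A p_{n-1} = 0$.

```python
import numpy as np

def cg_nonstandard(A, b, nsteps):
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    for _ in range(nsteps):
        Ap = A @ p
        alpha = (r @ p) / (p @ Ap)       # alpha_n = r_{n-1}^T p_{n-1} / p_{n-1}^T A p_{n-1}
        x = x + alpha * p
        r = b - A @ x                    # residual recomputed from scratch
        beta = -(r @ Ap) / (p @ Ap)      # beta_n = -r_n^T A p_{n-1} / p_{n-1}^T A p_{n-1}
        p = r + beta * p                 # new A-conjugate search direction
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(cg_nonstandard(A, b, 2), np.linalg.solve(A, b))   # exact after 2 steps for a 2x2 SPD A
```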

Optimality of Step Length

Select the step length $\alpha_n$ along the vector $p_{n-1}$ to minimize $\varphi(x) = \tfrac{1}{2} x^T A x - x^T b$. Let $x_n = x_{n-1} + \alpha_n p_{n-1}$. Then
$$\begin{aligned}
\varphi(x_n) &= \tfrac{1}{2} (x_{n-1} + \alpha_n p_{n-1})^T A (x_{n-1} + \alpha_n p_{n-1}) - (x_{n-1} + \alpha_n p_{n-1})^T b \\
&= \tfrac{1}{2} \alpha_n^2\, p_{n-1}^T A p_{n-1} + \alpha_n\, p_{n-1}^T A x_{n-1} - \alpha_n\, p_{n-1}^T b + \text{constant} \\
&= \tfrac{1}{2} \alpha_n^2\, p_{n-1}^T A p_{n-1} - \alpha_n\, p_{n-1}^T r_{n-1} + \text{constant}.
\end{aligned}$$
Therefore,
$$\frac{d\varphi}{d\alpha_n} = 0 \;\Longleftrightarrow\; \alpha_n\, p_{n-1}^T A p_{n-1} - p_{n-1}^T r_{n-1} = 0 \;\Longleftrightarrow\; \alpha_n = \frac{p_{n-1}^T r_{n-1}}{p_{n-1}^T A p_{n-1}}.$$
In addition, $p_{n-1}^T r_{n-1} = r_{n-1}^T r_{n-1}$, because $p_{n-1} = r_{n-1} + \beta_{n-1} p_{n-2}$ and $r_{n-1}^T p_{n-2} = 0$ due to the following theorem.