Krylov Subspace Methods for the Evaluation of Matrix Functions. Applications and Algorithms


Krylov Subspace Methods for the Evaluation of Matrix Functions. Applications and Algorithms
4. Monotonicity of the Lanczos Method
Michael Eiermann
Institut für Numerische Mathematik und Optimierung, Technische Universität Bergakademie Freiberg, Germany
Wintersemester 2010/2011
Michael Eiermann (TU Freiberg) Matrix Functions WS 2010/2011 1 / 24

Outline

1. An observation
2. A first result
3. Strict monotonicity
4. M-matrices
5. Special functions, Stieltjes functions
6. The main theorem

An observation

We solve our model problem (the 1-D heat equation) whose semi-discrete version reads as

u′(t) = A u(t), t > 0, u(0) = b given,

where A = h⁻² tridiag(1, −2, 1) ∈ R^(n×n), h = 1/(n + 1). Its solution is u(t) = exp(tA) b.

First step: Hermitian Lanczos process. Given A ∈ C^(n×n) Hermitian, b ∈ C^n, and f such that f(A) is defined:

w = b, v_0 = 0
For m = 1, 2, ...
    β_m = ‖w‖ (β_1 := ‖b‖_2)
    v_m = w / β_m
    w = A v_m − β_m v_{m−1}
    α_m = v_m^H w
    w = w − α_m v_m
End
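The Lanczos loop above translates almost line for line into NumPy. The following is my own minimal sketch, not code from the slides (the function name `lanczos` and the choice of b are illustrative); it builds V_m and T_m for the model problem:

```python
import numpy as np

def lanczos(A, b, m):
    """Hermitian Lanczos process from the slide: returns V_m (ON columns),
    the tridiagonal T_m, and beta_1 = ||b||_2."""
    n = len(b)
    V = np.zeros((n, m))
    alpha = np.zeros(m)
    beta = np.zeros(m)
    w = b.astype(float).copy()
    v_prev = np.zeros(n)
    for j in range(m):
        beta[j] = np.linalg.norm(w)        # beta_m = ||w||
        V[:, j] = w / beta[j]              # v_m = w / beta_m
        w = A @ V[:, j] - beta[j] * v_prev # w = A v_m - beta_m v_{m-1}
        alpha[j] = V[:, j] @ w             # alpha_m = v_m^H w
        w = w - alpha[j] * V[:, j]
        v_prev = V[:, j]
    T = np.diag(alpha) + np.diag(beta[1:], 1) + np.diag(beta[1:], -1)
    return V, T, beta[0]

# Model problem: A = h^-2 tridiag(1, -2, 1), h = 1/(n+1)
n = 99
h = 1.0 / (n + 1)
A = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2
b = np.ones(n)
V, T, beta1 = lanczos(A, b, 10)
```

After ten steps, V has orthonormal columns and T = V^H A V, as the slide states.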

The columns of V_m = [v_1 v_2 ... v_m] are an ON basis of K_m(A, b), and the tridiagonal matrix

        [ α_1  β_2                ]
        [ β_2  α_2  β_3           ]
T_m =   [      β_3  α_3   .       ]   ∈ R^(m×m)
        [            .    .   β_m ]
        [                β_m  α_m ]

represents the compression of A onto K_m(A, b), i.e., T_m = V_m^H A V_m. Note that T_m is real: α_m = v_m^H (A v_m − β_m v_{m−1}) = v_m^H A v_m ∈ [λ_min(A), λ_max(A)] because A is Hermitian.

Second step: Lanczos approximation to f(A) b,

f_m = β_1 V_m exp(T_m) e_1 = V_m exp(V_m^H A V_m) V_m^H b.

We use expm (scaling and squaring) to calculate exp(T_m).

The model problem

[Figure: semilogarithmic convergence plot, ‖exp(A)b − f_m‖ versus the iteration index m, decreasing from about 10^0 down to about 10^−8.]

n = 99, b = rand(n, 1): We observe monotone convergence.
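The experiment is easy to reproduce. This is my own sketch, not the original script behind the plot: it uses an eigendecomposition (`numpy.linalg.eigh`) in place of expm so that only NumPy is needed, and it adds full reorthogonalization (not in the slide's pseudocode) as numerical hygiene for the long run:

```python
import numpy as np

def expm_herm(T):
    """exp(T) for Hermitian T via eigendecomposition (stand-in for expm)."""
    lam, Q = np.linalg.eigh(T)
    return (Q * np.exp(lam)) @ Q.T

n = 99
h = 1.0 / (n + 1)
A = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2          # A = h^-2 tridiag(1, -2, 1)
rng = np.random.default_rng(0)
b = rng.random(n)                                   # b = rand(n, 1)
exact = expm_herm(A) @ b                            # exp(A) b

L = n                                               # run until the Krylov space is exhausted
V = np.zeros((n, L)); alpha = np.zeros(L); beta = np.zeros(L)
w = b.copy(); v_prev = np.zeros(n)
norms, errs = [], []
for j in range(L):
    beta[j] = np.linalg.norm(w)
    V[:, j] = w / beta[j]
    w = A @ V[:, j] - beta[j] * v_prev
    alpha[j] = V[:, j] @ w
    w -= alpha[j] * V[:, j]
    w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)        # full reorthogonalization
    v_prev = V[:, j]
    T = np.diag(alpha[:j + 1]) + np.diag(beta[1:j + 1], 1) + np.diag(beta[1:j + 1], -1)
    f_m = beta[0] * V[:, :j + 1] @ expm_herm(T)[:, 0]   # f_m = beta_1 V_m exp(T_m) e_1
    norms.append(np.linalg.norm(f_m))
    errs.append(np.linalg.norm(exact - f_m))
```

The recorded sequences show exactly the behavior of the plot: ‖f_m‖ grows monotonically toward ‖exp(A)b‖ while the error norms decrease monotonically.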

A first result

Theorem [Druskin (2008)]. Let A ∈ C^(n×n) be Hermitian and b ∈ C^n. For the Lanczos approximants f_m, m = 1, 2, ..., L, to f = exp(A) b, there holds

‖f_1‖ ≤ ‖f_2‖ ≤ ... ≤ ‖f_L‖ = ‖f‖,
‖f − f_1‖ ≥ ‖f − f_2‖ ≥ ... ≥ ‖f − f_L‖ = 0.

Proof. First assume that A is positive definite. Then T_m ≥ O (entrywise). This implies

O ≤ T̃_m := [ T_{m−1}, 0 ; 0^T, 0 ] ≤ [ T_{m−1}, β_m e_{m−1} ; β_m e_{m−1}^T, α_m ] = T_m

(with e_{m−1} the last unit coordinate vector of R^(m−1)).

Thus O ≤ T̃_m^k = [ T_{m−1}^k, 0 ; 0^T, 0 ] ≤ T_m^k for all k = 1, 2, ... Since exp(T) = I + T + (1/2!) T² + ... + (1/k!) T^k + ...,

I_m ≤ exp(T̃_m) = [ exp(T_{m−1}), 0 ; 0^T, 1 ] ≤ exp(T_m).

In particular,

e_1 ≤ exp(T̃_m) e_1 = [ exp(T_{m−1}) e_1 ; 0 ] ≤ exp(T_m) e_1.

Finally, ‖exp(T_{m−1}) e_1‖ ≤ ‖exp(T_m) e_1‖ and, since V_m has orthonormal columns and β_1 > 0,

‖f_{m−1}‖ = ‖β_1 V_{m−1} exp(T_{m−1}) e_1‖ = β_1 ‖exp(T_{m−1}) e_1‖ ≤ β_1 ‖exp(T_m) e_1‖ = ‖β_1 V_m exp(T_m) e_1‖ = ‖f_m‖.
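The entrywise inequalities used in this step are easy to sanity-check numerically. The following is my own small experiment (a random tridiagonal T with positive entries stands in for T_m; a truncated Taylor series stands in for a proper expm):

```python
import numpy as np

def expm_series(T, terms=60):
    """exp(T) via the Taylor series (adequate for this small, modest-norm test matrix)."""
    E = np.eye(T.shape[0]); P = np.eye(T.shape[0])
    for k in range(1, terms):
        P = P @ T / k          # P = T^k / k!
        E = E + P
    return E

m = 5
rng = np.random.default_rng(1)
alpha = rng.random(m) + 0.5          # positive diagonal
beta = rng.random(m - 1) + 0.5       # positive off-diagonal, so T_m >= O entrywise
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)

T_pad = np.zeros((m, m))             # the bordered matrix [T_{m-1}, 0; 0, 0]
T_pad[:m - 1, :m - 1] = T[:m - 1, :m - 1]

E, E_pad = expm_series(T), expm_series(T_pad)
```

Entrywise, I ≤ exp(T) and exp(T̃) ≤ exp(T), and hence the first columns satisfy the norm inequality used in the proof.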

The monotonicity of the errors follows immediately: We have

[ exp(T_{m−1}) e_1 ; 0 ] ≤ [ exp(T_m) e_1 ; 0 ] ≤ exp(T_L) e_1

(each vector padded with zeros to length L), which implies

O ≤ exp(T_L) e_1 − [ exp(T_m) e_1 ; 0 ] ≤ exp(T_L) e_1 − [ exp(T_{m−1}) e_1 ; 0 ]

and thus

‖exp(T_L) e_1 − [ exp(T_m) e_1 ; 0 ]‖ ≤ ‖exp(T_L) e_1 − [ exp(T_{m−1}) e_1 ; 0 ]‖.

The assertion now follows from the observation

‖f − f_m‖ = ‖β_1 V_L exp(T_L) e_1 − β_1 V_m exp(T_m) e_1‖ = β_1 ‖V_L ( exp(T_L) e_1 − [ exp(T_m) e_1 ; 0 ] )‖ = β_1 ‖exp(T_L) e_1 − [ exp(T_m) e_1 ; 0 ]‖.

If A is an arbitrary Hermitian matrix, we choose a shift µ such that B = A + µI is positive definite. The Lanczos approximations f_m^(B) to

exp(B) b = exp(µ) exp(A) b

are given by

f_m^(B) = β_1 V_m^(B) exp( T_m^(B) ) e_1 = β_1 V_m^(A) exp( T_m^(A) + µI ) e_1 = exp(µ) f_m^(A)

(easy exercise). This shows f_m^(A) = exp(−µ) f_m^(B) and

exp(A) b − f_m^(A) = exp(−µ) ( exp(B) b − f_m^(B) ),

which proves the theorem.

Note we showed more than we claimed: We have not only normwise but componentwise (with respect to the basis V_L) monotonicity.

Strict monotonicity

The monotonicity results described in the previous theorem can be sharpened (exercise!):

0 < ‖f_1‖ < ‖f_2‖ < ... < ‖f_L‖ = ‖f‖,
‖f − f_1‖ > ‖f − f_2‖ > ... > ‖f − f_L‖ = 0.

M-matrices

T = [t_{i,j}] ∈ R^(m×m) is a (nonsingular) M-matrix (Hermann Minkowski) if t_{i,j} ≤ 0 for all i ≠ j, T⁻¹ exists and T⁻¹ ≥ O.

We need the following properties of M-matrices:

(M1) Let A ∈ R^(n×n) have nonpositive off-diagonal entries. Then A is an M-matrix if and only if all eigenvalues of A have positive real parts.

(M2) If A, B ∈ R^(n×n) are two M-matrices, then A ≤ B implies O ≤ B⁻¹ ≤ A⁻¹.

(M3) For A, E ∈ R^(n×n), let A be an M-matrix and let A + E have nonpositive off-diagonal entries; then E ≥ O implies that A + E is an M-matrix.
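A quick numerical illustration of the definition and of (M2), using the classic M-matrix tridiag(−1, 2, −1) (my own example, not from the slides):

```python
import numpy as np

# A = tridiag(-1, 2, -1): nonpositive off-diagonal entries and positive eigenvalues
# 2 - 2 cos(k*pi/(n+1)), hence an M-matrix by (M1).
n = 6
A = np.diag(2.0 * np.ones(n)) + np.diag(-np.ones(n - 1), 1) + np.diag(-np.ones(n - 1), -1)
A_inv = np.linalg.inv(A)

# B >= A entrywise (off-diagonals made less negative); B is again an M-matrix,
# so (M2) predicts O <= B^-1 <= A^-1 entrywise.
B = A + np.diag(0.5 * np.ones(n - 1), 1) + np.diag(0.5 * np.ones(n - 1), -1)
B_inv = np.linalg.inv(B)
```

Both inverses come out entrywise nonnegative, and B⁻¹ ≤ A⁻¹ entrywise, exactly as (M2) states.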

Special functions

We consider functions f : (0, ∞) → R which can be represented as

f(z) = ∫₀^∞ 1/(t + z)^k dµ(t), z > 0.

Here k ∈ N and µ is a nonnegative measure for which ∫₀^∞ (1 + t)^(−k) dµ(t) is finite.

Example. Let δ_x denote the Dirac measure (i.e., δ_x(M) = 1 if x ∈ M and δ_x(M) = 0 otherwise). Then

z^(−k) = ∫₀^∞ 1/(t + z)^k dδ_0(t).

More generally, for x_j > 0, π_j > 0 (j = 1, 2, ..., m),

Σ_{j=1}^m π_j/(z + x_j)^k = ∫₀^∞ 1/(t + z)^k d( Σ_{j=1}^m π_j δ_{x_j} )(t).

Stieltjes integrals and Stieltjes transformation

Let [α, β] be a real finite closed interval and ψ : [α, β] → R. Let ∆ : α = τ_0 < τ_1 < ... < τ_m = β be a subdivision of [α, β] with norm ‖∆‖ := max_{1≤j≤m} (τ_j − τ_{j−1}). A set of pivotal elements Θ : τ′_1 ≤ τ′_2 ≤ ... ≤ τ′_m consistent with ∆ consists of numbers τ′_j with τ_{j−1} ≤ τ′_j ≤ τ_j (j = 1, 2, ..., m). For any (complex-valued) function f defined on [α, β], set

S(∆, Θ) := Σ_{j=1}^m f(τ′_j) ( ψ(τ_j) − ψ(τ_{j−1}) ).

If there is a complex number S such that, given any ε > 0, a number δ = δ(ε) exists such that |S(∆, Θ) − S| ≤ ε for all subdivisions ∆ with ‖∆‖ ≤ δ and all consistent Θ, then

S = ∫_α^β f(t) dψ(t)

is called the Stieltjes integral of f with respect to ψ on [α, β].

If ψ(t) = t + γ for some constant γ, the Stieltjes integral is the Riemann integral. If ψ is continuously differentiable on [α, β], then

∫_α^β f(t) dψ(t) = ∫_α^β f(t) ψ′(t) dt.

If ψ is a step function with finitely many jumps at ζ_1, ζ_2, ..., ζ_m, i.e.,

ψ(t) = 0 for α ≤ t ≤ ζ_1,
ψ(t) = Σ_{j=1}^k π_j for ζ_k < t ≤ ζ_{k+1},
ψ(t) = Σ_{j=1}^m π_j for ζ_m < t ≤ β,

then

∫_α^β f(t) dψ(t) = Σ_{j=1}^m π_j f(ζ_j).
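The step-function case can be checked by comparing a Riemann-Stieltjes sum S(∆, Θ) on a fine subdivision against the closed form Σ_j π_j f(ζ_j). A small sketch of mine, with arbitrary illustrative jump points and weights:

```python
import numpy as np

# Step function psi on [0, 1] with jumps pi_j at the points zeta_j
zeta = np.array([0.2, 0.5, 0.8])
pi_j = np.array([1.0, 2.0, 0.5])

def psi(t):
    """psi(t) = sum of the jumps pi_j with zeta_j < t (vectorized)."""
    t = np.atleast_1d(t)
    return (pi_j[None, :] * (zeta[None, :] < t[:, None])).sum(axis=1)

f = np.cos

# Riemann-Stieltjes sum on a fine subdivision, with the midpoints as pivots
tau = np.linspace(0.0, 1.0, 100001)
mid = 0.5 * (tau[1:] + tau[:-1])
S = np.sum(f(mid) * np.diff(psi(tau)))

# Closed form for a step function: sum_j pi_j f(zeta_j)
exact = np.sum(pi_j * f(zeta))
```

Only the subintervals containing a jump contribute to S, so the sum reduces to f evaluated near the jump points, weighted by the jump heights.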

If f is continuous and ψ is nondecreasing on [α, β], then ∫_α^β f(t) dψ(t) exists. If f is continuous and ψ is nondecreasing on [α, ∞), we set

∫_α^∞ f(t) dψ(t) = lim_{β→∞} ∫_α^β f(t) dψ(t)

provided the limit exists. If f is continuous and bounded on [α, ∞) and if ψ is nondecreasing and bounded on [α, ∞), then ∫_α^∞ f(t) dψ(t) exists.

Let ψ : [0, ∞) → R be nondecreasing and bounded. We call ζ ≥ 0 a point of increase of ψ if ψ is not constant on any interval [ζ − ε, ζ + ε], ε > 0.

Case 1. ψ has finitely many points of increase. Then ψ is a step function with finitely many jumps ζ_j, j = 1, 2, ..., m (namely at the points of increase). There holds

∫₀^∞ 1/(z + t) dψ(t) = Σ_{j=1}^m ( ψ(ζ_j+) − ψ(ζ_j−) ) / (z + ζ_j) =: r(z),

a rational function with simple poles on the negative real axis and positive residues.

Moreover,
(i) r is analytic in C \ (−∞, 0],
(ii) r(x) ≥ 0 for x ≥ 0,
(iii) r(U) ⊆ L and r(L) ⊆ U, where U and L denote the upper and lower half-plane, respectively.

Functions satisfying (i)-(iii) are called positive symmetric rational functions. Every positive symmetric rational function r of type (m − 1, m) or (m, m) can be written as

r(z) = α + ∫₀^∞ 1/(z + t) dψ(t)

with α ≥ 0 and a nondecreasing function ψ : [0, ∞) → R which has finitely many points of increase.

Case 2. ψ has infinitely many points of increase. Then

f(z) = ∫₀^∞ 1/(z + t) dψ(t)

exists for all z ∈ C \ (−∞, 0] and is an analytic function there. f is called the Stieltjes transform of ψ.

f(z) = log(1 + 1/z) when ψ(t) = t if 0 ≤ t ≤ 1 and ψ(t) = 1 for t ≥ 1.

f(z) = arctan(1/√z)/√z when ψ(t) = √t if 0 ≤ t ≤ 1 and ψ(t) = 1 for t ≥ 1.

f(z) = z^(−α), α ∈ (0, 1), when ψ(t) = ( sin((1 − α)π) / ((1 − α)π) ) t^(1−α).

f(z) = z^(−α) (1 + z)^(−β), 0 < α, α + β < 1.

If ψ is the distribution function of the measure µ, i.e., ψ(x) = µ([0, x]) = ∫₀^x dµ(t), and if w(t) is the associated density function, then (under suitable conditions)

∫₀^∞ 1/(z + t) dµ(t) = ∫₀^∞ 1/(z + t) dψ(t) = ∫₀^∞ w(t)/(z + t) dt.
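The first two examples are easy to verify by quadrature, since both ψ are constant beyond t = 1. A self-contained check of mine (the helper `trapezoid` is local so that nothing beyond NumPy is assumed):

```python
import numpy as np

def trapezoid(y, x):
    """Composite trapezoidal rule (kept local so the sketch is self-contained)."""
    return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x))

z = 2.0
t = np.linspace(0.0, 1.0, 200001)

# psi(t) = t on [0, 1] (constant afterwards):  integral_0^1 dt/(z+t) = log(1 + 1/z)
f1 = trapezoid(1.0 / (z + t), t)

# psi(t) = sqrt(t) on [0, 1]; substituting t = s^2 turns the integral into
# integral_0^1 ds/(z + s^2) = arctan(1/sqrt(z)) / sqrt(z)
f2 = trapezoid(1.0 / (z + t * t), t)
```

Both quadrature values agree with the stated closed forms to high accuracy.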

The main theorem

Theorem [Frommer (2009)]. Let A ∈ C^(n×n) be Hermitian positive definite and b ∈ C^n. Assume that the function f : (0, ∞) → R can be written as

f(z) = ∫₀^∞ 1/(t + z)^k dµ(t), z > 0,

with a nonnegative measure µ and k ∈ N. For the Lanczos approximants f_m to f(A) b and the resulting errors d_m = f(A) b − f_m, there holds:

{ ‖f_m‖ }_{1≤m≤L} is monotonically increasing.
{ ‖d_m‖ }_{1≤m≤L} is monotonically decreasing.

Proof. Step. For the matrix S m = diag(,,..., ( ) m ) R m m, there holds: S T m = S m and S 2 m = I m, i.e., S m = S T m = S m, The columns of V m S m = [v v 2 ( ) m+ v m ] =: V ± m form an ON basis of K m (A, b). T ± m := S m T m S m = α β 2 β 2 α 2 β 3 β 3 α 3... α m β m β m has nonpositive off-diagonal entries. If A and therefore T m as well as T ± m are positive definite, then T ± m and T ± m + ti m, t, are M-matrices ((M ) and (M 3 )). Michael Eiermann (TU Freiberg) Matrix Functions WS 2/2 9 / 24 α m

Step 2. We can write the Lanczos approximants f_m in the form

f_m = β_1 V_m f(T_m) e_1 = β_1 V_m S_m f(S_m T_m S_m) S_m e_1 = β_1 V_m^± f(T_m^±) e_1.

Consequently, ‖f_m‖ = ‖y_m‖, where y_m := β_1 f(T_m^±) e_1. For the special functions f which we consider here, there holds

y_m = β_1 ∫₀^∞ (t I_m + T_m^±)^(−k) e_1 dµ(t).

Step 3. We define T̃_m^± := [ T_{m−1}^±, O ; O, α_m ]. Then

t I_m + T̃_m^± ≥ t I_m + T_m^± for all t ≥ 0.

By (M3), t I_m + T̃_m^± is an M-matrix for all t ≥ 0, and

O ≤ (t I_m + T̃_m^±)⁻¹ ≤ (t I_m + T_m^±)⁻¹ for all t ≥ 0

by (M2).

Thus, for every t ≥ 0,

O ≤ (t I_m + T̃_m^±)^(−k) ≤ (t I_m + T_m^±)^(−k)

and

O ≤ ∫₀^∞ (t I_m + T̃_m^±)^(−k) dµ(t) ≤ ∫₀^∞ (t I_m + T_m^±)^(−k) dµ(t)

as well as

O ≤ ∫₀^∞ (t I_m + T̃_m^±)^(−k) e_1 dµ(t) ≤ ∫₀^∞ (t I_m + T_m^±)^(−k) e_1 dµ(t).

But this is just

O ≤ [ y_{m−1} ; 0 ] ≤ y_m,

which is equivalent to ‖f_{m−1}‖ ≤ ‖f_m‖.

Step 4. The monotonicity of the errors

d_m = f(A) b − β_1 V_m f(T_m) e_1 = V_L^± y_L − V_m^± y_m

follows from d_m = V_L^± ( y_L − [ y_m ; 0 ] ), so that ‖d_m‖ = ‖y_L − [ y_m ; 0 ]‖.

Remark. For the Dirac measure µ = δ_0 there holds δ_0(M) = 1 if 0 ∈ M and δ_0(M) = 0 if 0 ∉ M, and

∫₀^∞ 1/(t + z)^k dµ(t) = z^(−k).

For k = 1 this means that the errors of the CG method decrease monotonically with respect to ‖·‖_2 (for a different proof, see [Steihaug (1983)]).
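For k = 1 and f(z) = 1/z the Lanczos approximant f_m is the FOM/CG iterate for A x = b, so the remark predicts monotone 2-norm error decrease for CG. A sketch of mine on a random SPD matrix (full reorthogonalization added as numerical hygiene; the spectrum and sizes are illustrative):

```python
import numpy as np

n = 60
rng = np.random.default_rng(3)
Q = np.linalg.qr(rng.standard_normal((n, n)))[0]
lam = np.linspace(1.0, 20.0, n)
A = (Q * lam) @ Q.T                        # SPD test matrix with spectrum in [1, 20]
b = rng.standard_normal(n)
exact = np.linalg.solve(A, b)              # f(A) b for f(z) = 1/z

L = 25
V = np.zeros((n, L)); alpha = np.zeros(L); beta = np.zeros(L)
w = b.copy(); v_prev = np.zeros(n)
errs = []
for j in range(L):
    beta[j] = np.linalg.norm(w)
    V[:, j] = w / beta[j]
    w = A @ V[:, j] - beta[j] * v_prev
    alpha[j] = V[:, j] @ w
    w -= alpha[j] * V[:, j]
    w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)   # full reorthogonalization
    v_prev = V[:, j]
    T = np.diag(alpha[:j + 1]) + np.diag(beta[1:j + 1], 1) + np.diag(beta[1:j + 1], -1)
    e1 = np.zeros(j + 1); e1[0] = 1.0
    f_m = beta[0] * V[:, :j + 1] @ np.linalg.solve(T, e1)   # the FOM/CG iterate
    errs.append(np.linalg.norm(exact - f_m))
```

The recorded 2-norm errors decrease monotonically, as the remark predicts.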

An extension. We can apply the monotonicity results to functions of the form g(z) = f(z) p(z), where f is as above and p is a polynomial (of low degree). We write g(A) b = f(A) b̃ with b̃ = p(A) b and apply the Lanczos method in the Krylov spaces K_m(A, b̃).

E.g., sign(A) b = (A²)^(−1/2) A b, which suggests approximating B^(−1/2) b̃ with B = A² (Hermitian positive definite if A is Hermitian and nonsingular) and b̃ = A b, i.e., we work in K_m(A², A b).
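A sketch of the suggested procedure for sign(A) b (my own illustration, not from the slides: a random Hermitian indefinite A, Lanczos on K_m(A², A b), and B^(−1/2) applied to the small matrix via an eigendecomposition):

```python
import numpy as np

def inv_sqrt_e1(T):
    """T^(-1/2) e_1 for symmetric positive definite T, via eigendecomposition."""
    lam_T, Q_T = np.linalg.eigh(T)
    e1 = np.zeros(T.shape[0]); e1[0] = 1.0
    return Q_T @ ((Q_T.T @ e1) / np.sqrt(lam_T))

n = 50
rng = np.random.default_rng(4)
Q = np.linalg.qr(rng.standard_normal((n, n)))[0]
lam = np.concatenate([np.linspace(-3.0, -1.0, n // 2), np.linspace(1.0, 3.0, n - n // 2)])
A = (Q * lam) @ Q.T                        # Hermitian, indefinite, nonsingular
b = rng.standard_normal(n)
sign_exact = (Q * np.sign(lam)) @ Q.T @ b  # sign(A) b, for reference

B = A @ A                                  # Hermitian positive definite
bt = A @ b                                 # b~ = A b
L = 30
V = np.zeros((n, L)); alpha = np.zeros(L); beta = np.zeros(L)
w = bt.copy(); v_prev = np.zeros(n)
for j in range(L):                         # Lanczos in K_m(A^2, A b)
    beta[j] = np.linalg.norm(w)
    V[:, j] = w / beta[j]
    w = B @ V[:, j] - beta[j] * v_prev
    alpha[j] = V[:, j] @ w
    w -= alpha[j] * V[:, j]
    w -= V[:, :j + 1] @ (V[:, :j + 1].T @ w)   # full reorthogonalization
    v_prev = V[:, j]
T = np.diag(alpha) + np.diag(beta[1:], 1) + np.diag(beta[1:], -1)
approx = beta[0] * V @ inv_sqrt_e1(T)      # Lanczos approximation to B^(-1/2) b~
```

Since B has a modest condition number here, thirty Lanczos steps suffice for the approximation to agree with sign(A) b to high relative accuracy.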

Hints to the literature

C. Berg. Quelques remarques sur le cône de Stieltjes. Lecture Notes in Mathematics 84, Springer, Berlin, Heidelberg 1984.

A. Berman and R. J. Plemmons. Nonnegative Matrices in the Mathematical Sciences. Academic Press, New York 1979. Updated edition, Classics in Applied Mathematics Vol. 9, SIAM, Philadelphia 1994.

V. Druskin. On monotonicity of the Lanczos approximation to the matrix exponential. Linear Algebra Appl. 429, 679-683 (2008).

A. Frommer. Monotone convergence of the Lanczos approximations to matrix functions of Hermitian matrices. Electron. Trans. Numer. Anal. 35, 118-128 (2009).

T. Fujimoto and R. R. Ranade. Two characterizations of inverse-positive matrices: the Hawkins-Simon condition and the Le Chatelier-Braun principle. Electron. J. Linear Algebra 11, 59-65 (2004).

P. Henrici. Applied and Computational Complex Analysis. Vol. 2: Special Functions, Integral Transforms, Asymptotics, Continued Fractions. John Wiley & Sons, New York 1977.

T. Steihaug. The conjugate gradient method and trust regions in large scale optimization. SIAM J. Numer. Anal. 20, 626-637 (1983).