Sensitivity of Gauss-Christoffel quadrature and sensitivity of Jacobi matrices to small changes of spectral data


Sensitivity of Gauss-Christoffel quadrature and sensitivity of Jacobi matrices to small changes of spectral data

Zdeněk Strakoš, Academy of Sciences and Charles University, Prague, http://www.cs.cas.cz/~strakos

A joint work with Diane P. O'Leary and Petr Tichý. Zeuthen, June 2008.

Jacobi matrices

$$
T_N = \begin{pmatrix}
\gamma_1 & \delta_2 & & \\
\delta_2 & \gamma_2 & \ddots & \\
 & \ddots & \ddots & \delta_N \\
 & & \delta_N & \gamma_N
\end{pmatrix}, \qquad \delta_l > 0.
$$

Fully determined by the spectrum and the top (bottom) entries of the normalized eigenvectors. Are the entries sensitive to small changes of the spectral data?
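As a concrete illustration (not part of the original slides; a minimal sketch assuming NumPy, with illustrative matrix entries), the spectral data in question can be computed as follows:

```python
import numpy as np

def jacobi_spectral_data(gamma, delta):
    """Nodes (eigenvalues) and weights (squared top entries of the
    normalized eigenvectors) of the Jacobi matrix with main diagonal
    gamma and positive off-diagonal delta."""
    T = np.diag(gamma) + np.diag(delta, 1) + np.diag(delta, -1)
    nodes, V = np.linalg.eigh(T)      # columns of V are orthonormal
    weights = V[0, :] ** 2            # squared top entries; they sum to 1
    return nodes, weights

# illustrative 3x3 example
nodes, weights = jacobi_spectral_data(np.array([2.0, 3.0, 4.0]),
                                      np.array([1.0, 0.5]))
```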

Outline

1. Lanczos, CG and the Gauss-Christoffel quadrature
2. Reconstruction of Jacobi matrices from the spectral data
3. Another problem
4. Gauss-Christoffel quadrature can be sensitive to small perturbations of the distribution function
5. Lanczos and CG in finite precision arithmetic

1 : Lanczos tridiagonalization (1950, 1952)

$A \in \mathbb{R}^{N \times N}$ large and sparse, SPD; $w_1 \equiv r_0 / \|r_0\|$, $r_0 \equiv b - A x_0$,

$$
A W_n = W_n T_n + \delta_{n+1} w_{n+1} e_n^T, \qquad W_n^T W_n = I, \qquad W_n^T w_{n+1} = 0, \qquad n = 1, 2, \ldots,
$$

$$
T_n = \begin{pmatrix}
\gamma_1 & \delta_2 & & \\
\delta_2 & \gamma_2 & \ddots & \\
 & \ddots & \ddots & \delta_n \\
 & & \delta_n & \gamma_n
\end{pmatrix}, \qquad \delta_l > 0.
$$

Consider, for clarity, that the process does not stop until $n = N$.
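A minimal sketch of the recurrence (an editorial addition, assuming NumPy; it ignores the loss of orthogonality that finite precision introduces, which is the subject of part 5):

```python
import numpy as np

def lanczos(A, w1, n):
    """Plain Lanczos: returns the n-by-n Jacobi matrix T_n and the
    basis W_n generated from A and the starting vector w1."""
    N = A.shape[0]
    W = np.zeros((N, n + 1))
    gamma, delta = np.zeros(n), np.zeros(n + 1)
    W[:, 0] = w1 / np.linalg.norm(w1)
    for j in range(n):
        u = A @ W[:, j]
        if j > 0:
            u -= delta[j] * W[:, j - 1]
        gamma[j] = W[:, j] @ u
        u -= gamma[j] * W[:, j]
        delta[j + 1] = np.linalg.norm(u)     # coupling to the next basis vector
        if delta[j + 1] == 0:                # invariant subspace found
            break
        W[:, j + 1] = u / delta[j + 1]
    T = np.diag(gamma) + np.diag(delta[1:n], 1) + np.diag(delta[1:n], -1)
    return T, W[:, :n]
```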

1 : Lanczos, CG and Gauss-Ch. quadrature

$$
Ax = b, \; x_0 \qquad \longleftrightarrow \qquad \int_\zeta^\xi f(\lambda) \, d\omega(\lambda)
$$

$$
T_n y_n = \|r_0\| e_1, \quad x_n = x_0 + W_n y_n \qquad \longleftrightarrow \qquad \sum_{i=1}^n \omega_i^{(n)} f\big(\theta_i^{(n)}\big), \quad \omega^{(n)}(\lambda) \to \omega(\lambda)
$$
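The right-hand column is computable from $T_n$ alone; a sketch (an editorial addition, assuming NumPy) of this Golub-Welsch connection, with the eigenvalues of $T_n$ as the nodes $\theta_i^{(n)}$ and the squared first entries of its normalized eigenvectors as the weights $\omega_i^{(n)}$:

```python
import numpy as np

def gauss_from_jacobi(Tn):
    """Nodes and weights of the n-point Gauss quadrature rule
    associated with the Jacobi matrix Tn (Golub-Welsch)."""
    theta, V = np.linalg.eigh(Tn)
    omega = V[0, :] ** 2
    return theta, omega

# sum(omega * f(theta)) then approximates the integral of f d omega
```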

1 : Literature (beginning)

Sensitivity of the map from the nodes and weights of the computed quadrature to the recurrence coefficients of the corresponding orthonormal polynomials: Gautschi (1968, 1970, 1978, 1982, 2004), Nevai (1979), H. J. Fischer (1998), Elhay, Golub, Kautsky (1991, 1992), Beckermann and Bourreau (1998), Laurie (1999, 2001).

1 : Sensitivity of the Lanczos recurrences

$A, E \in \mathbb{R}^{N \times N}$, $A$ diagonal SPD,

$$
A, w_1 \;\longrightarrow\; T_n \;\longrightarrow\; T_N = W_N^T A W_N,
$$
$$
A + E, \; w_1 + e \;\longrightarrow\; \widetilde{T}_n \;\longrightarrow\; \widetilde{T}_N = \widetilde{W}_N^T (A + E) \, \widetilde{W}_N.
$$

Strongly nonlinear relationship $T_N(A, w_1) = W_N^T(A, w_1) \, A \, W_N(A, w_1)$. $\widetilde{T}_N$ has all its eigenvalues close to those of $A$. Is $\widetilde{T}_n$, for sufficiently small perturbations of $A, w_1$, close to $T_n$?

1 : Literature (continuation)

Sensitivity of the Lanczos recurrences: Gelfand and Levitan (1951), Burridge (1980), Natterer (1989), Xu (1993), Druskin, Borcea and Knizhnerman (2005), Carpraux, Godunov and Kuznetsov (1996), Kuznetsov (1997), Paige and van Dooren (1999). Here, however, sensitivity of Krylov subspaces has to be investigated as a part of the problem!

Outline

1. Lanczos, CG and the Gauss-Christoffel quadrature
2. Reconstruction of Jacobi matrices from the spectral data
3. Another problem
4. Gauss-Christoffel quadrature can be sensitive to small perturbations of the distribution function
5. Lanczos and CG in finite precision arithmetic

2 : Literature (continuation)

Computation of the inverse eigenvalue problem - reconstruction of $T_N$ from the nodes and weights: Stieltjes (1884), de Boor and Golub (1978), Gautschi (1982, 1983, 2004, 2005), Gragg and Harrod (1984), Boley and Golub (1987), Reichel (1991), H. J. Fischer (1998), Rutishauser (1957, 1963, 1990), Fernando and Parlett (1994), Parlett (1995), Parlett and Dhillon (1997), Laurie (1999, 2001).

2 : Literature (final part)

Computation of nodes (eigenvalues) and weights (squared top elements of the normalized eigenvectors) from $T_n$: Wilkinson (1965), Kahan (19??), Demmel and Kahan (1990), Demmel, Gu, Eisenstat, Slapničar, Veselić and Drmač (1999), Dhillon (1997), Li (1997), Parlett and Dhillon (2000), Laurie (2000), Dhillon and Parlett (2003, 2004), Dopico, Molera and Moro (2003), Grosser and Lang (2005), Willems, Lang and Vömel (2005). Some summary in Meurant and S (2006), O'Leary, S and Tichý (2007).

2 : Accurate recovery of recursion coefficients

Laurie (1999): a constructive proof of the following statement: Given the weights and the $N-1$ positive differences between the consecutive nodes, the main diagonal entries of the corresponding Jacobi matrix (shifted by the smallest node) and the off-diagonal entries can be computed in $\frac{9}{2}N^2 + O(N)$ arithmetic operations, all of which involve only addition, multiplication and division of positive numbers. Consequently, in finite precision arithmetic they can be computed to a relative accuracy no worse than $\frac{9}{2}N^2 \varepsilon + O(N \varepsilon)$, where $\varepsilon$ denotes machine precision.

2 : Sensitivity result

Laurie (1999, 2001): This result also bounds the conditioning of the problem: If the weights and the $N-1$ positive differences between the consecutive nodes are perturbed, with the size of the relative perturbations of the individual entries bounded by some small $\epsilon$, then such a perturbation can cause a relative change of the individual entries of the shifted main diagonal and of the individual off-diagonal entries of the Jacobi matrix not larger than $\frac{9}{2}N^2 \epsilon + O(N \epsilon)$. The resulting algorithm combines ideas from earlier works in approximation theory, orthogonal polynomials, and numerical linear algebra.
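Laurie's algorithm itself is not reproduced here. As an illustration of the reconstruction problem only (an editorial sketch, assuming NumPy and the lanczos() routine above), $T_N$ can be recovered from given nodes and weights by the textbook route, Lanczos applied to the diagonal matrix of the nodes; this route is simple but does not carry the accuracy guarantees above:

```python
import numpy as np

def jacobi_from_spectral_data(nodes, weights):
    """Reconstruct the Jacobi matrix whose eigenvalues are `nodes` and
    whose normalized eigenvectors have squared top entries `weights`
    (assumed to sum to 1)."""
    Lambda = np.diag(nodes)
    w1 = np.sqrt(weights)
    T, _ = lanczos(Lambda, w1, len(nodes))
    return T
```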

Outline

1. Lanczos, CG and the Gauss-Christoffel quadrature
2. Reconstruction of Jacobi matrices from the spectral data
3. Another problem
4. Gauss-Christoffel quadrature can be sensitive to small perturbations of the distribution function
5. Lanczos and CG in finite precision arithmetic

3 : An example basic problem

$A \in \mathbb{R}^{N \times N}$ is diagonal positive definite (SPD), S (1991), Greenbaum, S (1992),

$$
\lambda_i = \lambda_1 + \frac{i-1}{n-1} \, (\lambda_n - \lambda_1) \, \gamma^{\,n-i}, \qquad i = 2, \ldots, n-1.
$$

In the experiment we take $\lambda_1 = 0.1$, $\lambda_n = 100$, $n = 24$, $\gamma = 0.55$. The starting vector $w_1 \in \mathbb{R}^N$ has been generated randomly. Lanczos process: $A, w_1 \longrightarrow T_n \longrightarrow T_N = W_N^T A W_N$.
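A sketch (an editorial addition, assuming NumPy; the random seed is arbitrary) generating this spectrum and the starting vector:

```python
import numpy as np

def strakos_eigenvalues(lam1=0.1, lamn=100.0, n=24, gamma=0.55):
    """Eigenvalues accumulating towards lam1, controlled by gamma in (0,1)."""
    i = np.arange(2, n)                  # i = 2, ..., n-1
    lam = np.empty(n)
    lam[0], lam[-1] = lam1, lamn
    lam[1:-1] = lam1 + (i - 1) / (n - 1) * (lamn - lam1) * gamma ** (n - i)
    return lam

rng = np.random.default_rng(0)
A = np.diag(strakos_eigenvalues())
w1 = rng.standard_normal(24)
w1 /= np.linalg.norm(w1)
```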

3 : A particular larger problem

$\hat{A} \in \mathbb{R}^{2N \times 2N}$ diagonal SPD, $\hat{w}_1 \in \mathbb{R}^{2N}$, obtained by replacing each eigenvalue of $A$ by a pair of very close eigenvalues of $\hat{A}$ sharing the weight of the original eigenvalue. In terms of the distribution functions, $\hat{\omega}(\lambda)$ has doubled points of increase but it is very close to $\omega(\lambda)$.

$$
\hat{A}, \hat{w}_1 \;\longrightarrow\; \hat{T}_n \;\longrightarrow\; \hat{T}_{2N} = \hat{W}_{2N}^T \hat{A} \hat{W}_{2N}
$$

$\hat{T}_{2N}$ has all its eigenvalues close to those of $A$. However, $\hat{T}_n$ can for $n \leq N$ be very different from $T_n$.
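A sketch (an editorial addition, assuming NumPy; eps is an illustrative gap) of the doubled problem: each eigenvalue is replaced by the pair $\lambda_i \pm \mathrm{eps}$, and each weight $(w_1)_i^2$ is split in half, so the starting-vector components scale by $1/\sqrt{2}$:

```python
import numpy as np

def double_eigenvalues(lam, w1, eps=1e-8):
    """Replace each eigenvalue by a close pair sharing its weight."""
    lam_hat = np.concatenate([lam - eps, lam + eps])
    w_hat = np.concatenate([w1, w1]) / np.sqrt(2.0)   # halves each squared weight
    order = np.argsort(lam_hat)
    return lam_hat[order], w_hat[order]
```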

Exact arithmetic!

3 : Lanczos results for $A, w_1$ and $\hat{A}, \hat{w}_1$:

[Figure: four log-scale panels comparing quantities from the two Lanczos runs against iteration $n$.]

3 : CG results for $A, w_1$ and $\hat{A}, \hat{w}_1$:

[Figure: $\|x - x_n\|_A^2$ for the perturbed and the original problem, and the difference of the $n$th errors relative to the difference of the initial errors, against iteration $n$ (log scale).]

3 : Observations

Replacing single eigenvalues by two close ones causes large delays. The presence of close eigenvalues causes an irregular staircase-like behaviour. Local decrease of the error says nothing about the total error. Stopping criteria must be based on global information.

3 : Ritz values in the presence of close eigenvalues

In the presence of very close eigenvalues, a Ritz value in the exact Lanczos or CG method initially converges to the cluster as fast as if the cluster were replaced by a single eigenvalue with the combined weight. Within a few further steps it converges very fast to one of the eigenvalues, with another Ritz value converging simultaneously to approximate the rest of the cluster. In the presence of more than two eigenvalues in a cluster, the story repeats until all eigenvalues in the cluster are approximated by individual Ritz values. The additional Ritz values in the clusters are, however, missing in the other part of the spectrum, and the convergence of CG is delayed, in comparison to the single-eigenvalue case, by the number of steps needed to provide the missing Ritz values.
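This behaviour can be watched directly; a sketch (an editorial addition, reusing lanczos(), strakos_eigenvalues(), double_eigenvalues() and w1 from the sketches above) printing a few of the largest Ritz values at each step:

```python
import numpy as np

lam_hat, w_hat = double_eigenvalues(strakos_eigenvalues(), w1)
A_hat = np.diag(lam_hat)
for n in range(1, 30):
    Tn, _ = lanczos(A_hat, w_hat, n)
    # largest Ritz values: first one value per cluster, later a splitting pair
    print(n, np.linalg.eigvalsh(Tn)[-4:])
```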

3 : Published explanations

The fact that the presence of close eigenvalues affects the convergence of Ritz values and therefore the rate of convergence of the conjugate gradient method is well known; see the beautiful explanation given by van der Sluis and van der Vorst (1986, 1987). It is closely related to the convergence of the Rayleigh quotient in the power method and to the so-called misconvergence phenomenon in the Lanczos method, see O'Leary, Stewart and Vandergraft (1979), Parlett, Simon and Stringer (1982).

3 : Caution

Kratzer, Parter and Steuerwalt, Block splittings for the conjugate gradient method, Computers and Fluids 11 (1983), pp. 255-279. The statement on p. 261, second paragraph, in our notation says: The convergence of CG for $A, w_1$ and $\hat{A}, \hat{w}_1$ ought to be similar; at least $\|\hat{x} - \hat{x}_N\|_{\hat{A}}$ should be small. Similar statements can be found in several later papers and some books. The arguments are based on relating the CG minimizing polynomial to the minimal polynomial of $A$. For some distributions of eigenvalues of $A$, however, its minimal polynomial (normalized to one at zero) can have extremely large gradients and therefore it can be very large at points even very close to its roots (here at the eigenvalues of $\hat{A}$).

3 : CG results for $A, w_1$ and $\hat{A}, \hat{w}_1$, $\gamma = 0.8$:

[Figure: the same comparison as above with $\gamma = 0.8$; $\|x - x_n\|_A^2$ for the perturbed and the original problem, and the relative difference of the errors, against iteration $n$ (log scale).]

Outline

1. Lanczos, CG and the Gauss-Christoffel quadrature
2. Reconstruction of Jacobi matrices from the spectral data
3. Another problem
4. Gauss-Christoffel quadrature can be sensitive to small perturbations of the distribution function
5. Lanczos and CG in finite precision arithmetic

4 : CG and Gauss-Ch. quadrature errors

At any iteration step $n$, CG represents the matrix formulation of the $n$-point Gauss quadrature of the Riemann-Stieltjes integral determined by $A$ and $r_0$,

$$
\int_\zeta^\xi f(\lambda) \, d\omega(\lambda) = \sum_{i=1}^n \omega_i^{(n)} f\big(\theta_i^{(n)}\big) + R_n(f).
$$

For $f(\lambda) \equiv \lambda^{-1}$ the formula takes the form

$$
\frac{\|x - x_0\|_A^2}{\|r_0\|^2} = \text{$n$th Gauss quadrature} \; + \; \frac{\|x - x_n\|_A^2}{\|r_0\|^2}.
$$

This was a basis for the CG error estimation in [DaGoNa-78, GoFi-93, GoMe-94, GoSt-94, GoMe-97, ...].
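A numerical check of this identity (an editorial sketch, assuming NumPy and gauss_from_jacobi() above; the direct solve is for verification only):

```python
import numpy as np

def cg_error_from_quadrature(A, b, x0, Tn, xn):
    """Verify: ||x - x0||_A^2 / ||r0||^2 equals the n-point Gauss rule
    for f(lambda) = 1/lambda plus ||x - xn||_A^2 / ||r0||^2."""
    x = np.linalg.solve(A, b)                 # verification only
    r0 = b - A @ x0
    theta, omega = gauss_from_jacobi(Tn)
    gauss_n = np.sum(omega / theta)           # quadrature of 1/lambda
    integral = (x - x0) @ A @ (x - x0) / (r0 @ r0)
    estimate = (integral - gauss_n) * (r0 @ r0)   # estimated ||x - xn||_A^2
    direct = (x - xn) @ A @ (x - xn)
    return estimate, direct
```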

4 : Sensitivity of the Gauss-Ch. quadrature

[Figure: quadrature error for the perturbed and the original integral, and the difference of the estimates against the difference of the integrals, versus iteration $n$ (log scale).]

4 : Simplified problem

[Figure: quadrature errors for the original integral and for two perturbed integrals, and the corresponding recurrence coefficients $\beta_s$, versus iteration $n$ (log scale).]

4 : Theorem - O'Leary, S, Tichý (2007)

Consider distribution functions $\omega(x)$ and $\tilde{\omega}(x)$ on $[a, b]$. Let $p_n(x) = (x - x_1) \cdots (x - x_n)$ and $\tilde{p}_n(x) = (x - \tilde{x}_1) \cdots (x - \tilde{x}_n)$ be the $n$th orthogonal polynomials corresponding to $\omega$ and $\tilde{\omega}$ respectively, with $\hat{p}_s(x) = (x - \xi_1) \cdots (x - \xi_s)$ their least common multiple. If $f$ is continuous on $[a, b]$, then the difference $\Delta_n(\omega, \tilde{\omega})$ between the approximation $I_\omega^n$ to $I_\omega$ and the approximation $I_{\tilde{\omega}}^n$ to $I_{\tilde{\omega}}$, obtained from the $n$-point Gauss-Christoffel quadrature, is bounded as

$$
\Delta_n(\omega, \tilde{\omega}) \;\leq\; \left| \int_a^b \hat{p}_s(x) \, f[\xi_1, \ldots, \xi_s, x] \, d\omega(x) \right| + \left| \int_a^b \hat{p}_s(x) \, f[\xi_1, \ldots, \xi_s, x] \, d\tilde{\omega}(x) \right| + \left| \int_a^b f(x) \, d\omega(x) - \int_a^b f(x) \, d\tilde{\omega}(x) \right|.
$$

4 : Modified moments do not help

[Figure: condition numbers of the matrix of the modified moments, $GM_n(\Omega_0, \Omega_1)$ and $GM_n(\Omega_0, \Omega_2)$, and of the matrix of the mixed moments, $MM_n(\Omega_0, \Omega_1)$ and $MM_n(\Omega_0, \Omega_2)$, against iteration $n$. Left - enlarged supports, right - shifted supports.]

4 : Summary

1. Gauss-Christoffel quadrature for a small number of quadrature nodes can be highly sensitive to small changes in the distribution function. In particular, the difference between the corresponding quadrature approximations (using the same number of quadrature nodes) can be many orders of magnitude larger than the difference between the integrals being approximated.

2. This sensitivity in Gauss-Christoffel quadrature can be observed for discontinuous, continuous, and even analytic distribution functions, and for analytic integrands uncorrelated with changes in the distribution functions and with no singularity close to the interval of integration.

Outline

1. Lanczos, CG and the Gauss-Christoffel quadrature
2. Reconstruction of Jacobi matrices from the spectral data
3. Another problem
4. Gauss-Christoffel quadrature can be sensitive to small perturbations of the distribution function
5. Lanczos and CG in finite precision arithmetic

5 : CG applied to the basic problem

[Figure: squared energy norm of the error for finite precision CG and exact CG, and their difference, against iteration $n$ (log scale).]
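A sketch reproducing the flavour of this comparison (an editorial addition, assuming NumPy; single precision stands in for "finite precision", and a double-precision direct solve provides the reference):

```python
import numpy as np

def cg_squared_energy_errors(A, b, n_iters, dtype):
    """Plain CG in the given dtype; returns ||x - x_n||_A^2 per step,
    measured against a double-precision direct solve."""
    A64, b64 = A.astype(np.float64), b.astype(np.float64)
    x_star = np.linalg.solve(A64, b64)
    A, b = A.astype(dtype), b.astype(dtype)
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    errs = []
    for _ in range(n_iters):
        Ap = A @ p
        alpha = (r @ r) / (p @ Ap)
        x = x + alpha * p
        r_new = r - alpha * Ap
        p = r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
        e = x_star - x.astype(np.float64)
        errs.append(e @ A64 @ e)              # squared energy norm of the error
    return errs

# e.g. compare cg_squared_energy_errors(A, b, 40, np.float32)
# with the np.float64 run on the basic problem above
```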

5 : Observations - FP CG

Rounding errors can cause large delays. They may cause an irregular staircase-like behaviour. Local decrease of the error says nothing about the total error. Stopping criteria must be based on global information, justified by rigorous rounding error analysis. Golub and S (1994), S and Tichý (2002, 2005), Comput. Methods Appl. Mech. Engrg. (2003).

5 : Close to the exact CG for $\hat{A}\hat{x} = \hat{b}$?

Finite precision Lanczos and CG computations behave as exact computations for a different problem: Paige (1971-1980), Greenbaum (1989); Parlett, Scott, Simon, Grcar, Cullum, S, Golub, Notay, Druskin, Knizhnerman, Meurant, Tichý, Wülling, Zemke, ... Recent review and update: Meurant and S, Acta Numerica (2006), Meurant (2006).

Conclusions

It is good to look for interdisciplinary links and for different lines of thought, such as linking Krylov subspace methods with model reduction and matching moments. Rounding error analysis of Krylov subspace methods has had unexpected side effects, such as the understanding of general mathematical phenomena independent of any numerical stability issues. Analysis of Krylov subspace methods for solving linear problems has to deal with highly nonlinear finite dimensional phenomena. The pieces of the mosaic fit together.

Thank you!