Eigenvalue Estimation with the Rayleigh-Ritz and Lanczos methods


Eigenvalue Estimation with the Rayleigh-Ritz and Lanczos Methods

Ivo Panayotov
Department of Mathematics and Statistics, McGill University, Montréal, Québec, Canada
August 2010

A thesis submitted to the Faculty of Graduate Studies and Research in partial fulfillment of the requirements of the degree of Doctor of Philosophy.

Copyright © Ivo Panayotov, 2010


Abstract

In this thesis we study two different problems related to eigenvalue error bounds. In the first part of the thesis, we examine a conjecture of Knyazev and Argentati [SIAM J. Matrix Anal. Appl., 29 (2006)] bounding the difference between the Ritz values of a Hermitian matrix A for two subspaces, one of which is A-invariant. We provide a proof of a slightly weaker version of the conjecture and discuss the recently published full proof. Moreover, we give implications of the now proven bound and examine how it compares to a classical bound in the same context. In the second part of the thesis, we derive some properties of complex Hessenberg matrices and consider the relevant normal-matrix cases of these in order to re-examine the lengths of the Ritz vectors in the rounding error analysis of the Lanczos process for tridiagonalizing certain normal matrices. This question has already been studied for the real symmetric case, but part of that analysis has never been published in a scientific journal, and for that case we give new theory. For the more general normal-matrix cases we develop applicable theory, including some new tight bounds.


Résumé

In this thesis we study two different problems related to eigenvalue error bounds. In the first part of the thesis, we examine a conjecture of Knyazev and Argentati [SIAM J. Matrix Anal. Appl., 29 (2006)] bounding the difference between the Ritz values of a Hermitian matrix A corresponding to two subspaces, one of which is A-invariant. We obtain a proof of a slightly weaker version of the conjecture and discuss the recently published full proof. Moreover, we derive some implications of the now proven bound and compare it to a classical bound in the same context. In the second part of the thesis, we derive some properties of complex Hessenberg matrices and examine the relevant normal cases of these in order to re-examine the lengths of the Ritz vectors in the error analysis of the Lanczos process for tridiagonalizing certain normal matrices. This question has already been studied for the real symmetric case, but part of that analysis has never been published in a scientific journal; for that case we present new theory. For the more general case of normal matrices, we develop applicable theory, with new tight bounds.


Acknowledgments

I would like to thank my supervisor Chris Paige for his continued guidance, support, and encouragement throughout my doctoral studies. I am grateful for the incredible amount of time he spent with me, both academically and otherwise. I would also like to thank my supervisor Xiao-Wen Chang. Although I did not work with him directly, I learned a lot from his two superb courses in matrix computations, and from his sharp and always to-the-point questions. I express my deepest gratitude to both for taking me on as their student five years ago and for the great commitment that they have shown. I am grateful to David Airapetyan for taking the time to discuss my research problems and for helping me clarify my own direction, to Felicia Magpantay for enthusiastically performing Matlab tests with me on my own problems, and to Layan El Hajj, Svetla Vassileva and Jeremy Macdonald for fruitful discussions. My thanks go to all my friends in the mathematics department for making my stay at McGill particularly pleasant. I am very fortunate and grateful to have received funding from an FQRNT of Québec scholarship and Chris Paige's NSERC of Canada grant OGP, making it possible for me to attend conferences and to focus on my studies without financial worries.


Table of Contents

Abstract
Résumé
Acknowledgments
Introduction
1 Majorization Bounds for Ritz Values of Hermitian Matrices
  1.1 Ritz Values in Eigenvalue Approximation
  1.2 Classical Bounds for Ritz Values
  1.3 Definitions and Prerequisites
    1.3.1 Notation
    1.3.2 Angles between Subspaces
    1.3.3 Majorization
  1.4 Majorization Bounds for Ritz Values
  1.5 Discussion
  1.6 Full Proof of the Main Conjecture
  1.7 Normwise Implications
  1.8 Majorization versus Classical Bounds
  1.9 Concluding Remarks
2 Hessenberg Matrix Properties
  2.1 Notation
  2.2 Hessenberg Matrix Properties
  2.3 A Side Implication
  2.4 Concluding Remarks
3 Ritz Vectors in the Lanczos Process
  3.1 The Lanczos Process
  3.2 Implementation of the Lanczos Process
  3.3 Approximating Eigenvalues from the Lanczos Process
  3.4 The Lanczos Process in Finite Precision
  3.5 Our Current Approach for Studying z_m
  3.6 Bound for an Isolated Ritz Value
  3.7 Divided Differences
  3.8 Bound for a Cluster of Ritz Values
  3.9 Comparing the Old and New Approaches
  3.10 Concluding Remarks
Conclusion

Introduction

Objectives of the Research

Eigenvalue problems appear in many applications. For example, frequencies of vibration in mechanical systems can be found by solving eigenproblems, while the energy levels of a system in quantum mechanics are the eigenvalues of the Hamiltonian. Eigenvalue methods are used today in these and many other applications, including spectral data clustering and internet search engines. Eigenvalues cannot be computed exactly except in trivial cases, so they are usually approximated numerically. A posteriori and a priori eigenvalue error bounds describe the quality of eigenvalue approximations, and such error bounds are a classical and important topic in matrix analysis. A posteriori bounds are based on information readily available when running an algorithm, e.g., the eigenvector residual. They are used for practical estimation of, or bounds on, the eigenvalue error from the algorithm. A priori bounds are based on information not readily available during an algorithm run. Nevertheless, they help our understanding of algorithms, and, as Wilkinson [59, p. 166] pointed out, a priori bounds are "of great value in assessing the relative performance of algorithms." Thus a priori bounds are important in both theory and practice. In this thesis we examine two different questions related to eigenvalues: the first is an a priori eigenvalue error bound, and the second is related to an a posteriori eigenvalue error bound.

In the first problem, we examine a conjecture of Knyazev and Argentati [32] on the difference between Ritz values of a Hermitian matrix A for two subspaces, one of which is A-invariant. This conjecture generalizes a classical one-dimensional bound of Ruhe [53] to multidimensional subspaces using majorization. We show here, and in [2], that the conjectured bound holds under some additional assumptions, and that a slightly weaker version of it holds in the general case. We also present recent work of Knyazev and Argentati [33] proving the conjectured bound in all cases. We then examine some of the consequences of the conjecture, now a theorem, and compare the new majorization bound to a multidimensional version of the classical one, see [49].

In the second problem, we develop properties of Hessenberg matrices, some of which we believe are new and useful for the analysis of algorithms. We then use these properties to re-examine the lengths of the Ritz vectors in the finite precision Lanczos process for the Hermitian eigenvalue problem, an important question in the rounding error analysis of this process. Our analysis is also valid for the Lanczos process adapted to skew-Hermitian matrices and to normal matrices with collinear eigenvalues, that is, eigenvalues which lie on a line segment in $\mathbb{C}$, see [50].

Our thesis is outlined as follows. In Chapter 1 we examine the conjecture posed by Knyazev and Argentati in [32], in Chapter 2 we develop properties of Hessenberg matrices, and in Chapter 3 we provide the new analysis for the lengths of the Ritz vectors in the Lanczos process for Hermitian, skew-Hermitian, and normal matrices with collinear eigenvalues.

A Brief Review and Introduction to the Literature

Here we mention some background, together with some key literature. The full relevance of the literature to the present work is given in the main text.

Our first major topic is the examination of a conjecture posed by Knyazev and Argentati in [32], which generalizes a classical one-dimensional result on the Rayleigh quotient by Ruhe [53]. This conjecture of Knyazev and Argentati provides a link between two important topics in linear algebra: majorization, and the Rayleigh-Ritz method for the Hermitian eigenvalue problem.

Majorization is a classical topic in theoretical linear algebra and is particularly important in matrix perturbation theory. Majorization (weak or strong) is a type of inequality (comparison) relation between two real vectors which is defined differently from the usual componentwise or normwise inequality. Majorization inequalities arise naturally, e.g., when describing the spectrum or singular values of sums or products of matrices. They are important because in many circumstances these inequalities describe the eigenvalue or singular value relationships between matrices much more precisely than is achievable with componentwise or normwise inequalities. Majorization is a well developed field in theoretical linear algebra, see, e.g., [7, 41, 24].

The Rayleigh-Ritz method for the eigenvalue problem is a classical topic in computational linear algebra. Although it is customary to say the Rayleigh-Ritz "method", it is not an actual algorithm in itself; it is more a technique for obtaining eigenvalue approximations when some algorithm, such as some iteration, produces approximations to eigenvectors. The Rayleigh-Ritz approach generalizes the Rayleigh quotient. If $A$ is an $n \times n$ Hermitian matrix and $y$ is a unit vector, presumably obtained from some algorithm, we can obtain an approximation to an eigenvalue of $A$ based on this trial vector $y$ by forming the Rayleigh quotient $y^HAy$. If $y$ is close to an eigenvector, the approximation will be good, as we shall see, coinciding exactly with an eigenvalue of $A$ if $y$ is an eigenvector. The idea behind the Rayleigh-Ritz method is essentially the same, but now applied to subspaces. Suppose $\mathcal{Y}$ is a $k$-dimensional subspace of $\mathbb{C}^n$ with an orthonormal basis given by the columns of a matrix $Y$, presumably obtained from some algorithm; then one can obtain approximations to a $k$-block of eigenvalues of $A$ by computing the eigenvalues of $Y^HAY$.
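To make the above description concrete, here is a minimal numerical sketch (our own illustration, not part of the thesis) of the Rayleigh quotient and the Rayleigh-Ritz procedure. NumPy is assumed, and the matrix and subspace are arbitrary stand-ins for quantities an iterative algorithm would supply.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3

# A random Hermitian test matrix (stand-in for the operator A).
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2

# Rayleigh quotient for a single unit trial vector y.
y = rng.standard_normal(n) + 1j * rng.standard_normal(n)
y /= np.linalg.norm(y)
rayleigh_quotient = (y.conj() @ A @ y).real      # real because A is Hermitian

# Rayleigh-Ritz: orthonormal basis Y of a k-dimensional trial subspace;
# the Ritz values are the eigenvalues of the k-by-k matrix Y^H A Y.
Z = rng.standard_normal((n, k)) + 1j * rng.standard_normal((n, k))
Y, _ = np.linalg.qr(Z)                           # columns of Y are orthonormal
ritz_values = np.linalg.eigvalsh(Y.conj().T @ A @ Y)

print(rayleigh_quotient, ritz_values)
```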

There is a vast literature on Rayleigh-Ritz eigenvalue methods and error bounds; see, e.g., [52, Chapters 11–13], [58, Chapters 3–5], and [35, Chapter 4].

Despite the fact that majorization and the Rayleigh-Ritz method are two very important topics, the former in theoretical and the latter in computational linear algebra, there are not many links between them. We are aware of very few instances where majorization has been used in the context of the Rayleigh-Ritz method. The first such instance that we are aware of is the celebrated work of Davis and Kahan [11] to bound eigenvalue errors a posteriori. Recently, Knyazev and Argentati [31, 32] have established what appear to be the first a priori eigenvalue error bounds using majorization in the context of the Rayleigh-Ritz method, stating, in [32], the conjecture which constitutes the first major part of our work.

Our second major topic is a derivation of Hessenberg matrix properties and an application of these in the re-examination of the lengths of the Ritz vectors in the rounding error analysis of the Lanczos process. Hessenberg matrices are important in computational linear algebra because they are almost upper triangular and are obtained in intermediate steps of many algorithms for the eigenvalue problem, such as QR, Lanczos, Arnoldi, etc., see [17, 7, 9]. Recently Zemke [62] introduced polynomial vectors associated with a Hessenberg matrix which interpolate the eigenvectors and which appear to be very useful in the analysis of the properties of such matrices. In [63] Zemke uses these polynomial vectors in an analysis of perturbed Krylov subspace methods. Here we provide further use of these polynomial vectors in the context of a rounding error analysis of the Lanczos process for the eigenvalue problem, see [50].

We now briefly introduce the Hermitian Lanczos process. Let $A$ be an $n \times n$ Hermitian matrix. At step $k$, the Lanczos process for tridiagonalizing a Hermitian matrix (see, e.g., [17, Chapter 9]) in theory produces

$V_k \in \mathbb{C}^{n\times k}$, $T_k \in \mathbb{R}^{k\times k}$, $v_{k+1} \in \mathbb{C}^n$, and $\beta_{k+1} \in \mathbb{R}$ such that
$$AV_k = V_k T_k + v_{k+1}\beta_{k+1}e_k^T,$$
where $[V_k, v_{k+1}]$ has orthonormal columns, $\beta_j > 0$ for $j = 1,\dots,k$, $e_k^T$ is the $k$-th row of the identity matrix, and $T_k$ is real symmetric. The above process was introduced by Cornelius Lanczos in [36] for solving eigenvalue problems, and later in [37] for solving linear systems of equations. The idea is that the matrix $T_k$ above is the restriction of the operator $A$ to the subspace of $\mathbb{C}^n$ spanned by the columns of $V_k$. In the context of the eigenvalue problem for $A$, since $T_k$ has a very simple tridiagonal structure, one may solve for its eigenvalues very quickly and reliably, see, e.g., [52], [17, 8], and use the results to approximate the eigenvalues of $A$. If the range of $V_k$ is an approximate invariant subspace of $A$, the eigenvalues of $T_k$ will be good approximations to some of those of $A$. The Lanczos process applied to the eigenvalue problem is a particular instance of a Rayleigh-Ritz method since $T_k = V_k^H A V_k$. This algorithm is particularly useful for large sparse matrix computations since $T_k$ can be computed very efficiently from a three-term recurrence using matrix-vector products, which are performed quickly owing to the sparsity of $A$.

In exact arithmetic $V_k$ has orthonormal columns; in finite precision arithmetic, however, the columns of $V_k$ can quickly lose orthogonality. For this reason, together with the advent of a backward stable tridiagonalization algorithm produced by Wallace Givens in 1954 [15], the Lanczos algorithm was dismissed soon after its appearance. It was brought back to life by the work of Paige in the seventies, who performed a rounding error analysis of the real symmetric Lanczos process in [44], and later in [45, 46, 47], and showed that despite its departure from theory the algorithm is nevertheless extremely accurate, and very useful for finding eigenvalues and eigenvectors of large sparse symmetric matrices.
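For concreteness, here is a minimal sketch of the three-term Lanczos recurrence described above (our own NumPy illustration, not code from the thesis). Reorthogonalization is deliberately omitted, which is exactly why the computed columns of $V_k$ lose orthogonality in finite precision.

```python
import numpy as np

def lanczos(A, v1, k):
    """Plain Hermitian Lanczos: returns V_k (n x k) and the entries of T_k.

    No reorthogonalization is performed, so in floating point the columns
    of V slowly lose orthogonality, as discussed in the text.
    """
    n = A.shape[0]
    V = np.zeros((n, k + 1), dtype=complex)
    alpha = np.zeros(k)            # diagonal of T_k (real for Hermitian A)
    beta = np.zeros(k + 1)         # off-diagonal entries; beta[0] is unused
    V[:, 0] = v1 / np.linalg.norm(v1)
    for j in range(k):
        w = A @ V[:, j]
        if j > 0:
            w -= beta[j] * V[:, j - 1]
        alpha[j] = np.real(np.vdot(V[:, j], w))   # v_j^H (A v_j - beta_j v_{j-1})
        w -= alpha[j] * V[:, j]
        beta[j + 1] = np.linalg.norm(w)
        if beta[j + 1] == 0:       # exact invariant subspace found; stop early
            return V[:, :j + 1], alpha[:j + 1], beta[1:j + 1]
        V[:, j + 1] = w / beta[j + 1]
    return V[:, :k], alpha, beta[1:k]

# Example: Ritz values (eigenvalues of T_k) approximate extreme eigenvalues of A.
rng = np.random.default_rng(1)
n, k = 100, 20
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                                   # real symmetric test matrix
V, alpha, beta = lanczos(A.astype(complex), rng.standard_normal(n).astype(complex), k)
T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
print(np.sort(np.linalg.eigvalsh(T))[-3:], np.sort(np.linalg.eigvalsh(A))[-3:])
```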

Later, the understanding of the rounding error behaviour of this Lanczos process was developed further in the works of Parlett, Greenbaum, Strakoš, and others; see, for example, [52, 18, 19, 57, 26, 60, 61]. For an overview of this algorithm, its history, as well as an extensive bibliography, we refer the reader to the book by Meurant [42]. Variants of the Lanczos process that will be dealt with here are also applicable to skew-Hermitian matrices and to normal matrices with collinear eigenvalues, see [14, 39]. Ideas here might also be applicable to some variant of the Lanczos unsymmetric matrix tridiagonalization process in [36]; see also, for example, [59, 35–40]. They could possibly be used to develop the work initiated by Bai [5] on the rounding error analysis of this unsymmetric process.

Contributions of the Author

In the research for this thesis, the author has worked very closely with Chris Paige, one of his supervisors; thus the work in this thesis is essentially joint work, since most of the results were developed after many discussions and exchanges of ideas. Nevertheless, here we attempt to describe, to the best of our knowledge, the original results to which the author's contributions were particularly significant.

In Chapter 1 we examine a conjecture of Knyazev and Argentati [32] on the difference between Ritz values of a Hermitian matrix A for two subspaces, one of which is A-invariant. The author made significant contributions towards the proofs of Theorem 1.14 and its Corollary 1.15, showing that a slightly weaker version of the conjecture holds in the general case. Moreover, the author contributed largely to the examples in Section 1.5 demonstrating that, on the one hand, the conjectured bound is sharp, and on the other, that our approach cannot be improved to prove the full conjecture in all cases. Concerning the implications of the conjectured bound, now a theorem, the author contributed largely to the proofs of Corollary 1.21, which provides normwise implications, and Corollary 1.23, which compares the conjectured

bound to a multidimensional version of the classical one. The work on the proof of the slightly weaker version of the conjecture is original and was published in [2]. Notice that this paper is joint work with two other coauthors: Argentati and Knyazev. Initially the author of this thesis and his supervisor had submitted independent work on the proof of the conjecture, and naturally enough, Knyazev and Argentati were among the referees. They communicated to the editor that they were also working on the topic, so we suggested to the editor and to them that we rewrite the paper together. The outcome of our combined ideas and interaction is [2]. The later work on the consequences of the conjecture and its comparison to the classical bound is also original, and was published in [49].

In Chapters 2 and 3 we derive properties of Hessenberg matrices, and then use these to re-examine the lengths of the Ritz vectors in the finite precision Lanczos process for tridiagonalizing certain normal matrices. In this part the author contributed jointly to some of the lemmas in Chapter 2, although it is difficult to assert their originality or importance; the results presented in these lemmas are mostly used as tools for later analysis. In Chapter 3, the author contributed somewhat towards Theorem 3.3, which is original and which is the basic building block of our analysis; however, the statement of this theorem in its current form is due to Chris Paige. The author realized that Lemma 3.5 was the key tool for obtaining the later bounds, and made significant contributions towards Theorems 3.6, 3.7, 3.8, and 3.13, as well as their Corollaries 3.9 and 3.15, which provide new analysis and bounds for the lengths of the Ritz vectors in the Lanczos process for Hermitian matrices. In particular, he realized the relationship with divided differences, see Section 3.7, and their effectiveness in handling the case of a cluster of eigenvalues. With this insight he developed all the new theory for handling such clusters, leading to the entirety of Section 3.8. The bounds derived are essentially the same as the already existing bounds for the real symmetric application of the Lanczos process, but there is some

hope that they might be improved in the future. However, the analysis provided is more straightforward, hopefully simpler, and allows for a comprehensive treatment of the case of a cluster of Ritz values, which is the part of the old analysis that has never been published in a scientific journal. Our new approach is also more general, in that it applies directly to other versions of the Lanczos process, namely those for Hermitian, skew-Hermitian, and normal matrices with collinear eigenvalues. The theorems and analysis in Chapters 2 and 3 (except where stated otherwise) are original work submitted for publication in [50].

Chapter 1

Majorization Bounds for Ritz Values of Hermitian Matrices

1.1 Ritz Values in Eigenvalue Approximation

The Rayleigh-Ritz method is a classical technique in numerical linear algebra for approximating eigenvalues of Hermitian matrices. Although we use the word "method" in the description, it is not an actual algorithm in itself, but rather a technique for obtaining eigenvalue approximations when some algorithm produces approximations to eigenvectors. Let $A$ be an $n \times n$ Hermitian matrix and suppose we are given a unit vector $y$, presumably obtained from some algorithm. We can approximate an eigenvalue of $A$ based on this trial vector $y$ from the Rayleigh quotient $y^HAy$. The Rayleigh-Ritz method is simply the Rayleigh quotient applied to subspaces. If $\mathcal{Y}$ is a $k$-dimensional subspace of $\mathbb{C}^n$ with an orthonormal basis given by the columns of a matrix $Y$, presumably obtained from some algorithm, the Rayleigh-Ritz method approximates $k$ eigenvalues of $A$ by the eigenvalues of $Y^HAY$. These are also called Ritz values of $A$ corresponding to the trial subspace $\mathcal{Y}$. If $\mathcal{Y}$ is $A$-invariant, the Ritz values are exact eigenvalues of $A$. In this chapter we examine a conjecture posed in [32] relating to the quality of the eigenvalue approximations

of the Rayleigh-Ritz method when the trial subspace is close to $A$-invariant. This conjecture generalizes a classical one-dimensional result on the Rayleigh quotient by Ruhe [53].

1.2 Classical Bounds for Ritz Values

Let $x, y \in \mathbb{C}^n$ with $\|x\| = \|y\| = 1$, where $\|\cdot\|$ denotes the usual vector (or induced matrix) two-norm, and let $A = A^H \in \mathbb{C}^{n\times n}$. Write
$$\mathrm{spr}(A) \equiv \lambda_{\max}(A) - \lambda_{\min}(A), \qquad \theta(x,y) \equiv \arccos|x^Hy| \in [0, \pi/2], \qquad (1.1)$$
$\theta(x,y)$ being the acute angle between $x$ and $y$. Here $\mathrm{spr}(A)$ denotes the spread of the eigenvalues of $A$, i.e., the length of the smallest interval containing the eigenvalues of $A$; these are all real because $A$ is Hermitian. If we know $x$, $y$, and $\mathrm{spr}(A)$, we may bound the difference in the Rayleigh quotients using $\theta(x,y)$ as follows.

Theorem 1.1 ([31, Theorem 1]). Let $x, y \in \mathbb{C}^n$, $\|x\| = \|y\| = 1$, $A = A^H \in \mathbb{C}^{n\times n}$, and let $\theta(x,y)$ be as in (1.1). Then
$$|x^HAx - y^HAy| \le \mathrm{spr}(A)\,\sin\theta(x,y). \qquad (1.2)$$

Proof. [31, Theorem 1] gave a proof in $\mathbb{R}^n$, saying (1.2) also holds in $\mathbb{C}^n$; we give a proof in $\mathbb{C}^n$. Clearly (1.2) holds if $x^Hy = 0$, so from now on assume $x^Hy \neq 0$. The values of the left and right hand sides of (1.2) are unchanged when shifting $A$ to $A + \gamma I$, so we may assume without loss of generality that the spectrum of $A$ is centered at zero, i.e., $\|A\| = \lambda_{\max}(A)$ and $\mathrm{spr}(A) = 2\|A\|$. Also (1.1) and (1.2) are unaltered if $x$ is multiplied by a scalar $\alpha$ with $|\alpha| = 1$, so without loss of generality we can replace $x$ by $x(x^Hy)/|x^Hy|$, giving for the new $x$, $y^Hx = x^Hy = \cos\theta(x,y) > 0$.

Now $x^HAx - y^HAy$ is real and $x^HAy - y^HAx$ is purely imaginary, so that
$$|x^HAx - y^HAy|^2 \le |x^HAx + x^HAy - y^HAx - y^HAy|^2 = |(x-y)^HA(x+y)|^2 \le \|A\|^2\,\|x-y\|^2\,\|x+y\|^2$$
$$= \frac{\mathrm{spr}(A)^2}{4}\,[2 - (x^Hy + y^Hx)][2 + (x^Hy + y^Hx)] = \mathrm{spr}(A)^2\,(1 - \cos^2\theta(x,y)) = \mathrm{spr}(A)^2\sin^2\theta(x,y).$$
Taking square roots completes the proof.

The above result relates the difference of Rayleigh quotients to the angle between the (arbitrary) unit vectors $x$ and $y$. An important special case, both theoretically and practically, occurs when one of the vectors, say $x$, is an eigenvector of $A$ and $y$ is an approximation to $x$, often obtained from a numerical method. In this case, $x^HAx$ is an eigenvalue of $A$ and $y^HAy$ is an approximation to that eigenvalue. The classic result that motivates our research is the following: the Rayleigh quotient approximates an eigenvalue of a Hermitian matrix with accuracy proportional to the square of the eigenvector approximation error. The following result was proven by Ruhe [53].

Theorem 1.2 ([53, p. 146]). With the notation in Theorem 1.1, if $Ax = x\lambda$ then
$$|\lambda - y^HAy| = |x^HAx - y^HAy| \le \mathrm{spr}(A)\,\sin^2\theta(x,y). \qquad (1.3)$$

Proof. We give the proof in [2, p. 551]. Here $x^HAx = \lambda$. Let $y = u + v$ where $u \in \mathrm{span}\{x\}$ and $v \in (\mathrm{span}\{x\})^\perp$. Then $(A - \lambda I)u = 0$ and $\|v\| = \sin\theta(x,y)$, giving
$$|x^HAx - y^HAy| = |\lambda - y^HAy| = |y^H(A - \lambda I)y| = |v^H(A - \lambda I)v| \qquad (1.4)$$
$$\le \|A - \lambda I\|\,\|v\|^2 = \|A - \lambda I\|\,\sin^2\theta(x,y) \le \mathrm{spr}(A)\,\sin^2\theta(x,y).$$

Remark 1.1. The fact that $\lambda \equiv x^HAx$ is an eigenvalue of $A$ is used in the third equality in (1.4). If $\lambda$ is not an eigenvalue of $A$ the proof fails, because other non-zero terms appear in the expansion of $y^H(A - \lambda I)y$.
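The quadratic accuracy in (1.3) is easy to observe numerically. The following sketch (our own illustration, not from the thesis) perturbs an eigenvector by an angle θ and checks both bounds (1.2) and (1.3); the test matrix is an arbitrary stand-in.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
M = rng.standard_normal((n, n))
A = (M + M.T) / 2                       # Hermitian (real symmetric) test matrix
evals, evecs = np.linalg.eigh(A)
spr = evals[-1] - evals[0]              # spread of the spectrum

x = evecs[:, 0]                         # an exact unit eigenvector with eigenvalue lam
lam = evals[0]
p = rng.standard_normal(n)
p -= (x @ p) * x                        # direction orthogonal to x
p /= np.linalg.norm(p)

for theta in [1e-1, 1e-2, 1e-3]:
    y = np.cos(theta) * x + np.sin(theta) * p    # unit vector at angle theta from x
    err = abs(lam - y @ A @ y)                   # Rayleigh quotient error
    # err stays below spr*sin(theta) as in (1.2), and below spr*sin(theta)^2 as in (1.3)
    print(theta, err, spr * np.sin(theta), spr * np.sin(theta) ** 2)
```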

Suppose that $\sin\theta(x,y)$ is our measure of the error between the approximate eigenvector $y$ and the true eigenvector $x$. Then from (1.3) the eigenvalue approximation error $|\lambda - y^HAy|$ is at worst proportional to the square of the eigenvector approximation error. Therefore (1.3) is a big improvement over the more general (1.2). It is important to realize that these bounds depend on the theoretical quantity $\theta(x,y)$, which will not usually be computed, and so these are a priori results. Such results help our understanding rather than produce computationally useful a posteriori results. As Wilkinson [59, p. 166] pointed out, a priori bounds are "of great value in assessing the relative performance of algorithms." Thus while (1.3) is very interesting in its own right, depending on $\sin^2\theta(x,y)$ rather than $\sin\theta(x,y)$, it could also be useful for assessing the performance of algorithms that iterate vectors $y$ approximating $x$, in order to also approximate $x^HAx$.

Now suppose an algorithm produces a succession of $k$-dimensional subspaces $\mathcal{Y}^{(j)}$ approximating an invariant subspace $\mathcal{X}$ of $A$. For example, the block Lanczos algorithm of Golub and Underwood [16] is a Krylov subspace method which does this. In what ways can we generalize (1.3) to subspaces $\mathcal{X}$ and $\mathcal{Y}$ with $\dim\mathcal{X} = \dim\mathcal{Y} = k > 1$? In [32] Knyazev and Argentati proved the following theorem generalizing (1.2) to the multidimensional setting.

Theorem 1.3 ([32, Theorem 4.2]). Let $\mathcal{X}$, $\mathcal{Y}$ be subspaces of $\mathbb{C}^n$ having the same dimension $k$, with orthonormal bases given by the columns of the matrices $X$ and $Y$ respectively, and let $A \in \mathbb{C}^{n\times n}$ be a Hermitian matrix. Then
$$|\lambda(X^HAX) - \lambda(Y^HAY)| \prec_w \mathrm{spr}(A)\,\sin\theta(\mathcal{X},\mathcal{Y}). \qquad (1.5)$$
Here $\prec_w$ denotes the weak submajorization relation, $\theta(\mathcal{X},\mathcal{Y})$ denotes the vector

of principal angles between the subspaces $\mathcal{X}$ and $\mathcal{Y}$, and $\lambda(X^HAX)$ and $\lambda(Y^HAY)$ denote the vectors of eigenvalues (taken in non-increasing order) of $X^HAX$ and $Y^HAY$. These concepts will be explained in Section 1.3. Moreover, in the case where $\mathcal{X}$ is $A$-invariant, Knyazev and Argentati conjectured that "it is natural to expect a much better bound that involves the square of $\sin\theta(\mathcal{X},\mathcal{Y})$," further indicating that majorization results of this kind are apparently not known in the literature, see [32, p. 27]. In light of the classical result (1.3) we make the conjecture precise as follows:

Conjecture 1.1 ([2, Conjecture 3.1]). Let $\mathcal{X}$, $\mathcal{Y}$ be subspaces of $\mathbb{C}^n$ having the same dimension $k$, with orthonormal bases given by the columns of the matrices $X$ and $Y$ respectively. Also, let $A \in \mathbb{C}^{n\times n}$ be a Hermitian matrix, and let $\mathcal{X}$ be $A$-invariant. Then
$$|\lambda(X^HAX) - \lambda(Y^HAY)| \prec_w \mathrm{spr}(A)\,\sin^2\theta(\mathcal{X},\mathcal{Y}). \qquad (1.6)$$

Relations (1.5) and (1.6) are the respective higher dimensional analogues of the Rayleigh quotient error bound (1.2) and the classical (1.3), as we will shortly see. In Section 1.4 we provide the following partial answer to Conjecture 1.1:

Theorem 1.4 ([2, Theorem 3.1, Corollary 3.3]). Let $\mathcal{X}$, $\mathcal{Y}$ be subspaces of $\mathbb{C}^n$ having the same dimension $k$, with orthonormal bases given by the columns of the matrices $X$ and $Y$ respectively. Let $A \in \mathbb{C}^{n\times n}$ be a Hermitian matrix, and let $\mathcal{X}$ be $A$-invariant. Then
$$|\lambda(X^HAX) - \lambda(Y^HAY)| \prec_w \mathrm{spr}(A)\left(\sin^2\theta(\mathcal{X},\mathcal{Y}) + \tfrac{1}{2}\sin^4\theta(\mathcal{X},\mathcal{Y})\right). \qquad (1.7)$$
Moreover, if the $A$-invariant subspace $\mathcal{X}$ corresponds to the set of $k$ largest or smallest eigenvalues of $A$, or if all of the eigenvalues of $A$ corresponding to $\mathcal{X}$ lie between (and possibly include) one extreme eigenvalue of $A$ and the midpoint of $A$'s spectrum, then
$$|\lambda(X^HAX) - \lambda(Y^HAY)| \prec_w \mathrm{spr}(A)\,\sin^2\theta(\mathcal{X},\mathcal{Y}). \qquad (1.8)$$
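As an informal numerical illustration of the conjectured bound (1.6) (our own sketch, not part of the thesis), the following code builds an A-invariant subspace, perturbs it, computes the principal angles from the singular values of X^H Y, and checks the weak submajorization by comparing partial sums of the sorted vectors.

```python
import numpy as np

def weakly_submajorized(x, y):
    """Check x ≺_w y using partial sums of the decreasingly sorted entries."""
    xs = np.sort(x)[::-1]
    ys = np.sort(y)[::-1]
    return np.all(np.cumsum(xs) <= np.cumsum(ys) + 1e-12)

rng = np.random.default_rng(3)
n, k = 12, 4
M = rng.standard_normal((n, n))
A = (M + M.T) / 2
evals, evecs = np.linalg.eigh(A)
spr = evals[-1] - evals[0]

X = evecs[:, :k]                                              # A-invariant: exact eigenvectors
Y, _ = np.linalg.qr(X + 0.05 * rng.standard_normal((n, k)))   # perturbed trial subspace

cosines = np.clip(np.linalg.svd(X.T @ Y, compute_uv=False), 0.0, 1.0)
sin2 = 1.0 - cosines**2                                       # sin^2 of the principal angles

ritz_X = np.sort(np.linalg.eigvalsh(X.T @ A @ X))[::-1]
ritz_Y = np.sort(np.linalg.eigvalsh(Y.T @ A @ Y))[::-1]
diff = np.abs(ritz_X - ritz_Y)

print(weakly_submajorized(diff, spr * sin2))                  # expected: True
```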

Theorem 1.4 is a slightly weaker result than Conjecture 1.1. In numerical analysis we are mainly interested in these results as the angles become small, and then there is minimal difference between the right hand sides of (1.7) and (1.6), so proving the full Conjecture 1.1 is largely of mathematical interest. Although all the numerical tests we did leading to [2] suggested that (1.6) was true in all cases, we were unable to prove the full conjecture. Recently Knyazev and Argentati succeeded in proving (1.6) in all cases, see [33]. Their proof uses many of the same techniques we develop here, although the exact application is a little more sophisticated. Their proof is not part of our research, but we nevertheless present it, in Section 1.6, for completeness.

1.3 Definitions and Prerequisites

We introduce the definitions and tools we need, together with some mild motivation. We do not provide proofs for most of the results in this section; instead we refer the reader to some of the relevant literature.

1.3.1 Notation

For a real vector $x = [x_1,\dots,x_n]^T$, we use $x^\downarrow = [x_1^\downarrow,\dots,x_n^\downarrow]^T$ to denote $x$ with its elements rearranged in non-increasing order, while $x^\uparrow = [x_1^\uparrow,\dots,x_n^\uparrow]^T$ denotes $x$ with its elements rearranged in non-decreasing order. We use $|x|$ to denote the vector of absolute values of the components of $x$. We use the symbol $\le$ to compare real vectors componentwise. For real vectors $x$ and $y$ the expression $x \prec y$ means that $x$ is majorized by $y$, while $x \prec_w y$ means that $x$ is weakly submajorized by $y$, and $x \prec^w y$ means that $x$ is weakly supermajorized by $y$, see Section 1.3.3. We consider the Euclidean space $\mathbb{C}^n$ of column vectors equipped with the standard scalar product $x^Hy$ and the norm $\|x\| \equiv \sqrt{x^Hx}$. We use the same notation $\|A\|$ for

the induced matrix norm of a complex matrix $A \in \mathbb{C}^{n\times n}$. $\mathcal{X} = \mathcal{R}(X) \subseteq \mathbb{C}^n$ means the subspace $\mathcal{X}$ is equal to the range of the matrix $X$ with $n$ rows. The unit matrix is $I$, and the zero matrix (not necessarily square) is $0$, while $e \equiv [1,\dots,1]^T$. We use $\mathcal{H}(n)$ to denote the set of $n\times n$ Hermitian matrices and $\mathcal{U}(n)$ to denote the set of $n\times n$ unitary matrices in the set $\mathbb{C}^{n\times n}$ of all $n\times n$ complex matrices. For a vector $x$, $\mathrm{diag}(x)$ denotes the square matrix with $x$ along its main diagonal and zeros elsewhere; similarly, for a square matrix $B$, $\mathrm{diagof}(B)$ denotes the matrix $B$ with its off-diagonal elements set to zero, while $\mathrm{offdiag}(B) \equiv B - \mathrm{diagof}(B)$. We write $\lambda(A) \equiv \lambda^\downarrow(A)$ for the vector of eigenvalues of $A \in \mathcal{H}(n)$ arranged in descending order, and we write $\sigma(B) \equiv \sigma^\downarrow(B)$ for the vector of singular values of $B$ arranged in descending order. Individual eigenvalues and singular values are denoted by $\lambda_i(A)$ and $\sigma_i(B)$, respectively, so, e.g., $\mathrm{spr}(A) = \lambda_1(A) - \lambda_n(A)$ and $\sigma_1(B) = \|B\|$. Let subspaces $\mathcal{X}$ and $\mathcal{Y} \subseteq \mathbb{C}^n$ have the same dimension, with orthonormal bases given by the columns of the matrices $X$ and $Y$, respectively. We denote the vector of principal angles between $\mathcal{X}$ and $\mathcal{Y}$ arranged in descending order by $\theta(\mathcal{X},\mathcal{Y}) \equiv \theta^\downarrow(\mathcal{X},\mathcal{Y})$ and define it by using $\cos\theta(\mathcal{X},\mathcal{Y}) = \sigma^\uparrow(X^HY)$, see, e.g., [8], [17]. We clarify this concept in the next subsection.
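To keep this notation straight when experimenting numerically, here is a small helper sketch (our own, purely illustrative, assuming NumPy) mapping a few of the symbols above to code.

```python
import numpy as np

def down(x):      # x↓ : entries in non-increasing order
    return np.sort(x)[::-1]

def up(x):        # x↑ : entries in non-decreasing order
    return np.sort(x)

def diagof(B):    # diagof(B): off-diagonal entries set to zero
    return np.diag(np.diag(B))

def offdiag(B):   # offdiag(B) = B - diagof(B)
    return B - diagof(B)

def lam(A):       # λ(A): eigenvalues of Hermitian A, descending
    return down(np.linalg.eigvalsh(A))

def sig(B):       # σ(B): singular values, descending
    return np.linalg.svd(B, compute_uv=False)

def spr(A):       # spr(A) = λ_1(A) - λ_n(A)
    e = lam(A)
    return e[0] - e[-1]
```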

1.3.2 Angles between Subspaces

Let $x, y \in \mathbb{C}^n$, $\|x\| = \|y\| = 1$. In (1.1) we defined the acute angle between $x$ and $y$ via the cosine function, using their inner product. The acute angle provides a measure for the relative positioning of two unit vectors independently of any underlying basis. This notion of relative positioning can also be extended to multidimensional subspaces. Let $\mathcal{X}$, $\mathcal{Y}$ be $k$-dimensional subspaces of $\mathbb{C}^n$. One may define a vector $\theta(\mathcal{X},\mathcal{Y})$ of $k$ angles completely describing the relative position between these two subspaces as follows, see [8] and [17]. Let
$$\cos\theta_k(\mathcal{X},\mathcal{Y}) \equiv \max_{x\in\mathcal{X},\; y\in\mathcal{Y}} |x^Hy|, \qquad \|x\| = \|y\| = 1. \qquad (1.9)$$
This defines the smallest angle $\theta_k$ between $\mathcal{X}$ and $\mathcal{Y}$ (giving the largest cosine). In particular, if $\mathcal{X}$ and $\mathcal{Y}$ share a common non-zero vector then $\theta_k(\mathcal{X},\mathcal{Y}) = 0$. Here we use max rather than sup because the unit ball is compact in these finite dimensional subspaces, so the above maximum is achieved for some $x_k \in \mathcal{X}$ and $y_k \in \mathcal{Y}$. Now remove $x_k$ from $\mathcal{X}$ by considering the orthogonal complement of $x_k$ in $\mathcal{X}$, and do the same for $y_k$ in $\mathcal{Y}$. Repeat the definition (1.9) for the $(k-1)$-dimensional subspaces $\{x \in \mathcal{X} : x \perp x_k\}$ and $\{y \in \mathcal{Y} : y \perp y_k\}$, and then keep going in the same fashion until reaching empty spaces. After completion, the above procedure recursively defines the $k$ principal angles
$$0 \le \theta_k(\mathcal{X},\mathcal{Y}) \le \dots \le \theta_2(\mathcal{X},\mathcal{Y}) \le \theta_1(\mathcal{X},\mathcal{Y}) \le \frac{\pi}{2}$$
between the subspaces $\mathcal{X}$ and $\mathcal{Y}$. The vectors $\{x_1, x_2,\dots,x_k\}$ and $\{y_1, y_2,\dots,y_k\}$ are called principal vectors between the two subspaces. In short we have
$$\cos\theta_j(\mathcal{X},\mathcal{Y}) \equiv \max_{x\in\mathcal{X},\; y\in\mathcal{Y}} |x^Hy|, \qquad j = k, k-1,\dots,1, \qquad \text{where} \qquad (1.10)$$
$$\|x\| = \|y\| = 1, \qquad x^Hx_i = 0, \quad y^Hy_i = 0, \quad i = k, k-1,\dots,j+1.$$
The angles between subspaces are constructed from smallest to largest. Although the construction (1.10) appears slightly awkward, indexed backwards, it is convenient for us to order the angles from largest to smallest, i.e., $\theta(\mathcal{X},\mathcal{Y}) \equiv \theta^\downarrow(\mathcal{X},\mathcal{Y})$. In practice one is usually more interested in the larger than the smaller angles, since the larger angles are the ones which give a better idea of how far away the subspaces are from each other. The largest angle $\theta_1(\mathcal{X},\mathcal{Y})$ is usually called the gap between $\mathcal{X}$ and $\mathcal{Y}$ and is sometimes used as a measure of the distance between $\mathcal{X}$ and $\mathcal{Y}$. Another

widely used measure of distance between $\mathcal{X}$ and $\mathcal{Y}$ is $\sin\theta_1(\mathcal{X},\mathcal{Y})$, the sine of the gap. Of course the full relative positioning between the subspaces is described not only by the gap, but by the complete vector of angles between the subspaces. One may define principal angles even for subspaces of unequal dimensions in exactly the same fashion; in that case the number of angles corresponds to the dimension of the smaller of the two subspaces. Here we only deal with subspaces $\mathcal{X}$ and $\mathcal{Y}$ of equal dimension. In this case the principal vectors form orthonormal bases for $\mathcal{X}$ and $\mathcal{Y}$. These vectors are by no means unique; in fact one has many choices of sets of principal vectors. In particular, multiplying a principal vector by a unit-length scalar produces another principal vector, without affecting the other principal vectors in the set.

We now show that the angles between subspaces correspond to singular values. Let $\tilde X$ and $\tilde Y$ be any two orthonormal bases for the $k$-dimensional subspaces $\mathcal{X}$ and $\mathcal{Y}$ respectively. We can take unitary matrices $U$ and $V$ so that $U^H\tilde X^H\tilde Y V = \mathrm{diag}(\sigma_k,\dots,\sigma_1)$, where the singular values are written backwards, i.e., from smallest to largest as we go down the main diagonal. Let $X = \tilde X U$, $Y = \tilde Y V$; then the columns of $X$ and $Y$ are also orthonormal bases for $\mathcal{X}$ and $\mathcal{Y}$. Moreover, the columns of $X$ and $Y$ satisfy all the conditions of (1.10), showing that they can be taken as principal vectors between the subspaces $\mathcal{X}$ and $\mathcal{Y}$. This also shows that the cosines of the principal angles between $\mathcal{X}$ and $\mathcal{Y}$ are precisely the singular values of the matrix $\tilde X^H\tilde Y$. Note that these singular values are always the same regardless of the initial choice of bases $\tilde X$ and $\tilde Y$; that is, the angles depend on the subspaces but not on the choice of bases. Generally we have
$$\cos\theta(\mathcal{X},\mathcal{Y}) = \sigma^\uparrow(X^HY), \qquad (1.11)$$
for any orthonormal bases $X$, $Y$ of $\mathcal{X}$, $\mathcal{Y}$.
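The singular value characterization (1.11) is also how principal angles are computed in practice. Here is a small illustrative sketch (our own, not from the thesis); if SciPy is available, scipy.linalg.subspace_angles performs the same computation more robustly.

```python
import numpy as np

def principal_angles(X, Y):
    """Principal angles (descending) between the ranges of X and Y, via (1.11).

    X, Y: matrices with orthonormal columns spanning the two subspaces.
    """
    s = np.linalg.svd(X.conj().T @ Y, compute_uv=False)   # cosines, descending
    s = np.clip(s, -1.0, 1.0)                             # guard against rounding
    return np.sort(np.arccos(s))[::-1]                    # angles, descending

# Example: two 2-dimensional subspaces of R^4 sharing one direction,
# so the smallest principal angle is 0 and the gap is pi/2.
X = np.linalg.qr(np.array([[1., 0.], [0., 1.], [0., 0.], [0., 0.]]))[0]
Y = np.linalg.qr(np.array([[1., 0.], [0., 0.], [0., 1.], [0., 0.]]))[0]
print(principal_angles(X, Y))   # roughly [pi/2, 0]
```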

Moreover, the columns of $X$ and $Y$ can be chosen so that
$$C \equiv \mathrm{diag}(\cos\theta(\mathcal{X},\mathcal{Y})) = X^HY. \qquad (1.12)$$
Later we will often choose bases $X$, $Y$ for $\mathcal{X}$, $\mathcal{Y}$ as in (1.12).

1.3.3 Majorization

Majorization inequalities are comparison relations between real vectors. They appear naturally, e.g., when describing the spectrum or singular values of sums and products of matrices. Majorization is a well developed theoretical field, applied extensively in matrix analysis, see, e.g., [7, 24, 41]. Here we briefly introduce the subject and state a few important theorems that we will use later. With the notation in [7], we say that $x \in \mathbb{R}^n$ is weakly submajorized by $y \in \mathbb{R}^n$, written $x \prec_w y$, if
$$\sum_{i=1}^k x_i^\downarrow \le \sum_{i=1}^k y_i^\downarrow, \qquad k = 1, 2,\dots,n, \qquad (1.13)$$
while $x$ is weakly supermajorized by $y$, written $x \prec^w y$, if
$$\sum_{i=1}^k x_i^\uparrow \ge \sum_{i=1}^k y_i^\uparrow, \qquad k = 1, 2,\dots,n. \qquad (1.14)$$
Finally, $x$ is (strongly) majorized by $y$, written $x \prec y$, if (1.13) holds together with
$$\sum_{i=1}^n x_i = \sum_{i=1}^n y_i. \qquad (1.15)$$
From the above definitions, one can show directly that
$$x \prec y \iff \left\{ x \prec_w y, \ \ \sum_{i=1}^n x_i = \sum_{i=1}^n y_i \right\}, \qquad x \prec y \iff \{x \prec_w y, \ x \prec^w y\}.$$

Remark 1.2. We rarely use weak supermajorization from now on. We sometimes use the term majorization to describe all of the above majorization relations generally, but usually for $\prec$, emphasizing with "strong" whenever needed, whereas we use the term weak majorization for $\prec_w$, often omitting the "sub". We only use the precise "weak supermajorization" in the context of $\prec^w$.

The linear inequalities in the various majorization relations define convex sets in $\mathbb{R}^n$. Geometrically, $x \prec y$ if and only if the vector $x$ is in the convex hull of all vectors obtained by permuting the coordinates of $y$, see, e.g., [7, Theorem II.1.10]. If $x \prec_w y$ then $x$ is in a certain convex set depending on $y$, but in this case the description is a little more complicated; in particular this convex set is not bounded. However, if $x, y \ge 0$ then the corresponding set becomes a bounded convex polygon.

Majorization relations are important because they often provide an intermediate alternative to componentwise and normwise inequalities, and are often more precise than the other two. In many contexts we would like to compare vectors $x, y \in \mathbb{R}^n$ in some fashion, but a normwise inequality does not provide much information about how the components of $x$ and $y$ compare with each other, and is often weaker than necessary, whereas componentwise inequalities such as $x \le y$ or $y \le x$ may simply be false. In such cases, comparing $x$ and $y$ via majorization provides a viable alternative. In particular, when $x$ and $y$ are positive vectors, a case which often occurs in error estimation, comparing them via majorization is an intermediate alternative to componentwise and normwise comparison, see (1.22).

Strong $\prec$ and weak $\prec_w$ majorization relations only share some properties with the usual inequality relation, so one should deal with them carefully. For example, both $\prec$ and $\prec_w$ are reflexive and transitive, but $x \prec y$ and $y \prec x$ only implies that $x$ and $y$ are equal up to permutation; it does not imply that $x = y$, e.g., $x = (1, 0)^T$, $y = (0, 1)^T$. Similarly, $x \prec y$ does not imply the intuitive $x + z \prec y + z$, as is seen in the example $x = (0, 0, 0)^T$, $y = (2, -1, -1)^T$, $z = (-2, 0, 0)^T$. So we must be particularly

careful of the ordering when we combine results. It can be seen from (1.13) and (1.15) that $x + u \prec x^\downarrow + u^\downarrow$, e.g., [7, Corollary II.4.3], and this is part of the very useful result
$$\{x \prec_w y\}\ \&\ \{u \prec_w v\} \implies x + u \prec x^\downarrow + u^\downarrow \prec_w y^\downarrow + v^\downarrow, \qquad (1.16)$$
where this also holds with $\prec_w$ replaced by $\prec$. Here are some other basic majorization and related results that we will use later:
$$x \le y \implies x^\downarrow \le y^\downarrow \implies x \prec_w y; \qquad (1.17)$$
$$A \in \mathcal{H}(n) \implies |\lambda(\pm A)|^\downarrow = \sigma(A); \qquad (1.18)$$
$$|x \pm y| \prec_w |x|^\downarrow + |y|^\downarrow, \quad \text{since from (1.16)}\quad |x \pm y| \le |x| + |y| \prec |x|^\downarrow + |y|^\downarrow; \qquad (1.19)$$
$$x \prec y \implies |x| \prec_w |y|; \quad \text{see, e.g., [7, Example II.3.5].} \qquad (1.20)$$
Many inequality relations between eigenvalues and singular values are succinctly expressed as majorization or weak majorization relations; we use the following theorems later on. The proofs of these theorems are for the most part non-trivial, so we do not present them here; instead we refer the reader to the relevant literature.

Theorem 1.5 (Lidskii [38], see also, e.g., [7, p. 69]). Let $A, B \in \mathcal{H}(n)$. Then $\lambda(A) - \lambda(B) \prec \lambda(A - B)$.

Theorem 1.6 (Schur's Theorem, see, e.g., [7, p. 35]). Let $A \in \mathcal{H}(n)$. Then $\mathrm{diagof}(A)\,e \prec \lambda(A)$.

Theorem 1.7 (see, e.g., [41, Chapter 9, G.1.d], [24, Corollary 3.4.3]). If $A, B \in \mathbb{C}^{n\times n}$, then $\sigma(A \pm B) \prec_w \sigma(A) + \sigma(B)$.

Theorem 1.7 extends to the case of three or more matrices, since $\sigma(\cdot) = \sigma(\cdot)^\downarrow$ gives, via (1.16) and Theorem 1.7, $\sigma(A \pm B \pm C) \prec_w \sigma(A \pm B) + \sigma(C) \prec_w \sigma(A) + \sigma(B) + \sigma(C)$.
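As a quick sanity check of Theorems 1.5–1.7 (our own illustration, not from the thesis), the following sketch verifies the stated majorization relations numerically for random Hermitian test matrices.

```python
import numpy as np

def majorized(x, y, weak=False, tol=1e-10):
    """Check x ≺ y (or x ≺_w y if weak=True) via the partial-sum definitions."""
    xs, ys = np.sort(x)[::-1], np.sort(y)[::-1]
    ok = np.all(np.cumsum(xs) <= np.cumsum(ys) + tol)
    if weak:
        return ok
    return ok and abs(xs.sum() - ys.sum()) <= tol

def lam(A):   # eigenvalues of a Hermitian matrix, descending
    return np.sort(np.linalg.eigvalsh(A))[::-1]

def sig(B):   # singular values, descending
    return np.linalg.svd(B, compute_uv=False)

rng = np.random.default_rng(4)
n = 7
A = rng.standard_normal((n, n)); A = (A + A.T) / 2
B = rng.standard_normal((n, n)); B = (B + B.T) / 2

print(majorized(lam(A) - lam(B), lam(A - B)))               # Theorem 1.5 (Lidskii)
print(majorized(np.diag(A), lam(A)))                         # Theorem 1.6 (Schur)
print(majorized(sig(A + B), sig(A) + sig(B), weak=True))     # Theorem 1.7
```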

Theorem 1.8 (Weyl's Monotonicity Theorem, see, e.g., [7, Corollary III.2.3]). Let $A, H \in \mathcal{H}(n)$, where $H$ has non-negative eigenvalues. Then
$$\lambda(A) \le \lambda(A + H). \qquad (1.21)$$

Theorem 1.9 (see, e.g., [24], [7, p. 75]). $\sigma(AB) \le \|A\|\,\sigma(B)$ and $\sigma(AB) \le \|B\|\,\sigma(A)$ for arbitrary matrices $A$ and $B$ such that $AB$ exists.

Theorem 1.10 (see, e.g., [24]). $\sigma(AB) \prec_w \sigma(A)\,\sigma(B)$ for arbitrary matrices $A$ and $B$ such that $AB$ exists.

Remark 1.3. Notice that in the previous theorem we have the product of two vectors $\sigma(A)\,\sigma(B)$. From now on we adopt the convention that a product of vectors used in majorization is performed componentwise. Also, in Theorems 1.9 and 1.10 for rectangular matrices we may need to operate with nonnegative vectors of different lengths. A standard agreement in this case is to add zeros at the end of the shorter vector to match the sizes needed for componentwise arithmetic operations and comparisons. This agreement only makes sense because the components of $\sigma(\cdot)$ are nonnegative, so the extra zeros do not change the ordering. We use this agreement in later proofs.

Finally we state two theorems which show that majorization inequalities imply a wide variety of other inequalities, and in particular a wide variety of normwise inequalities.

Theorem 1.11 (see, e.g., [41, Proposition 4.B.1]). Let $x, y \in \mathbb{R}^k$. The inequality
$$\sum_{i=1}^k g(x_i) \le \sum_{i=1}^k g(y_i)$$

holds for all continuous convex functions $g : \mathbb{R} \to \mathbb{R}$ if and only if $x \prec y$.

Theorem 1.12 (see, e.g., [41, Proposition 4.B.2]). Let $x, y \in \mathbb{R}^k$. The inequality
$$\sum_{i=1}^k g(x_i) \le \sum_{i=1}^k g(y_i)$$
holds for all continuous increasing convex functions $g : \mathbb{R} \to \mathbb{R}$ if and only if $x \prec_w y$. Similarly, it holds for all continuous decreasing convex functions $g$ if and only if $x \prec^w y$.

Theorem 1.12 has the following important implication.

Corollary 1.13. Let $x, y \in \mathbb{R}^k$ have positive components. Then, using the standard definition of a $p$-norm, see, e.g., [7, p. 84], for any $p \in [1, \infty]$,
$$x \le y \implies x \prec_w y \implies \|x\|_p \le \|y\|_p. \qquad (1.22)$$

Proof. The first implication is just the same as (1.17). To prove the second, let $x$ and $y$ be nonnegative real vectors with $x \prec_w y$, and for $p \in [1, \infty)$ let $g(t) = t^p$, which is continuous, increasing, and convex on $[0, \infty)$. By Theorem 1.12 it follows that
$$\|x\|_p^p = \sum_{i=1}^k x_i^p \le \sum_{i=1}^k y_i^p = \|y\|_p^p.$$
Taking $p$-th roots gives the second implication for $p \in [1, \infty)$. The case $p = \infty$ holds because for any vector $u$, $\|u\|_p \to \|u\|_\infty$ as $p \to \infty$, and non-strict inequalities are preserved in the limit.

The implications in (1.22) hold only in the forward direction. For example, the converse of the first implication is broken with $x = (1, 1)^T$, $y = (2, 0)^T$. For the converse of the second implication, with $p \in [1, \infty)$, we can take $x = (1, 0)^T$, $y = \left( \left(\tfrac{2}{3}\right)^{1/p}, \left(\tfrac{2}{3}\right)^{1/p} \right)^T$, showing that $\|x\|_p = 1 < \left(\tfrac{4}{3}\right)^{1/p} = \|y\|_p$ but that $x_1 = 1 > \left(\tfrac{2}{3}\right)^{1/p} = y_1$, so $x \not\prec_w y$. For $p = \infty$, take $x = (3, 2)^T$, $y = (4, 0)^T$, giving $\|x\|_\infty = 3 < 4 = \|y\|_\infty$ but $x_1 + x_2 = 5 > 4 = y_1 + y_2$, so again $x \not\prec_w y$.

Corollary 1.13 is particularly important where the vector $x$ represents an approximation (positive) error and $y$ is some positive estimate of $x$. It says that for such bounds, the relation $x \prec_w y$ is an intermediate step between a componentwise and a normwise inequality. Numerical analysts like obtaining componentwise inequalities

because they give very precise information about the error vectors. On the other hand, a componentwise inequality using a particular estimate $y$ is not always possible to achieve; in fact the relations $x \le y$ and $x^\downarrow \le y^\downarrow$ are often false. In those cases, numerical analysts usually settle for a normwise relation like $\|x\|_p \le \|y\|_p$ for some $p$, usually $p = 1, 2$ or $\infty$, which is much weaker. With majorization, a numerical analyst has an alternative intermediate way of bounding errors, which is weaker than bounding errors componentwise, but stronger and more precise than bounding errors normwise.

We now give an example demonstrating that the classical bound (1.3) cannot be extended to multidimensional subspaces via a standard inequality relation. Consider
$$A = \;\; , \qquad X = \;\; , \qquad Y = \;\; , \qquad \mathcal{X} = \mathcal{R}(X), \quad \mathcal{Y} = \mathcal{R}(Y).$$
Here $A$ is Hermitian, $X$ and $Y$ form orthonormal bases for $\mathcal{X}$ and $\mathcal{Y}$ respectively, and $\mathcal{X}$ is $A$-invariant. From (1.11), and by calculating the eigenvalues of $A$, we have
$$\cos\theta(\mathcal{X},\mathcal{Y}) = \sigma^\uparrow(X^HY) = \begin{bmatrix}0\\1\end{bmatrix}, \qquad \sin\theta(\mathcal{X},\mathcal{Y}) = \sin^2\theta(\mathcal{X},\mathcal{Y}) = \begin{bmatrix}1\\0\end{bmatrix}, \qquad \mathrm{spr}(A) = 1.$$
Moreover, calculating the Ritz values and taking the absolute value of the difference gives
$$|\lambda(X^HAX) - \lambda(Y^HAY)| \not\le \begin{bmatrix}1\\0\end{bmatrix} = \mathrm{spr}(A)\,\sin^2\theta(\mathcal{X},\mathcal{Y}).$$
Hence the conjectured bound (1.6) simply does not hold if $\prec_w$ is replaced by $\le$. This example shows even more: even if we replace $\mathrm{spr}(A)$ by any other positive

constant, the inequality relation above would still break. This means that generalizing (1.3) to the multidimensional setting using $\sin^2\theta(\mathcal{X},\mathcal{Y})$ cannot be done via a standard inequality relation, so a generalization via weak majorization as in Conjecture 1.1 is a good alternative. In particular, as suggested by (1.22), generalizing (1.3) using weak majorization is stronger than a direct attempt to generalize the bound using an inequality with $p$-norms.

1.4 Majorization Bounds for Ritz Values

We now have all of the tools needed to prove our main results Theorem 1.14, Theorem 1.17, and their corollaries, which essentially establish (1.7) and (1.8), as well as some other related statements. In Section 1.5 we give an example demonstrating that the conjectured bound (1.6) is sharp, that is, equality can be reached. Our numerical tests suggested that (1.6) holds in all cases; however, in Section 1.5 we also show that the very first step in our proof of Theorem 1.14 does not allow us to prove the full statement (1.6), so a different approach is needed to show (1.6) in all cases. A full proof of (1.6) has recently been established by Knyazev and Argentati in [33]. In Section 1.6 we provide this proof with notation and slight modifications adapted to our current discussion.

Theorem 1.14 ([2, Theorem 3.2]). Let $\mathcal{X}$, $\mathcal{Y}$ be subspaces of $\mathbb{C}^n$ having the same dimension $k$, with orthonormal bases given by the columns of the matrices $X$ and $Y$ respectively. Let $A \in \mathbb{C}^{n\times n}$ be a Hermitian matrix, and let $\mathcal{X}$ be $A$-invariant. Then
$$|\lambda(X^HAX) - \lambda(Y^HAY)| \prec_w \mathrm{spr}(A)\left[e - \cos\theta(\mathcal{X},\mathcal{Y}) + \tfrac{1}{2}\sin^2\theta(\mathcal{X},\mathcal{Y})\right]. \qquad (1.23)$$

Proof. If $\tilde X$ and $X$ are any two orthonormal bases of $\mathcal{X}$ then $\tilde X = XU$ for some unitary matrix $U$, so $\lambda(X^HAX) = \lambda(\tilde X^HA\tilde X)$; in general, the Ritz values corresponding to a subspace are the same regardless of the choice of basis.

Choose $X = [x_1, x_2,\dots,x_k]$ and $Y = [y_1, y_2,\dots,y_k]$ as in (1.12), so that $C \equiv X^HY$ is real, square, and diagonal, with the diagonal entries in increasing order. Therefore,
$$C \equiv X^HY = \mathrm{diag}(\cos\theta(\mathcal{X},\mathcal{Y})). \qquad (1.24)$$
We arbitrarily complete $X$ and $Y$ to unitary matrices $[X, X_\perp]$ and $[Y, Y_\perp] \in \mathcal{U}(n)$, respectively, and consider the $2\times 2$ block partition of their unitary product $[X, X_\perp]^H[Y, Y_\perp]$. By construction of $X$ and $Y$, its $k\times k$ upper left block is $C$. We denote its $(n-k)\times k$ lower left block by $S \equiv X_\perp^HY$. We obtain
$$[X, X_\perp]^H[Y, Y_\perp] = \begin{bmatrix} X^HY & X^HY_\perp \\ X_\perp^HY & X_\perp^HY_\perp \end{bmatrix} = \begin{bmatrix} C & X^HY_\perp \\ S & X_\perp^HY_\perp \end{bmatrix}.$$
Since $[X, X_\perp]^H[Y, Y_\perp]$ is unitary, the entries $C$ and $S$ of its first block column satisfy
$$C^2 + S^HS = I. \qquad (1.25)$$
Hence $S^HS$ is diagonal, $S^HS = \mathrm{diag}(\sigma^2(S))$, and $\lambda(S^HS) = \lambda(I - C^2) = e - \cos^2\theta(\mathcal{X},\mathcal{Y}) = \sin^2\theta(\mathcal{X},\mathcal{Y})$, where $e$ is the vector of ones, so the vectors of singular values $\sigma(C)$ and $\sigma(S)$ are closely connected, and we derive from this that
$$\sin\theta(\mathcal{X},\mathcal{Y})^T = [\sigma(S)^T, 0,\dots,0], \qquad (1.26)$$
where $\max\{2k - n, 0\}$ zeros are added on the right-hand side to match the number $k$ of angles in the vector $\theta(\mathcal{X},\mathcal{Y})$ with the number $\min\{k, n-k\}$ of singular values in the vector $\sigma(S)$. Since $\mathcal{X}$ is $A$-invariant and $[X, X_\perp]$ is unitary, then
$$[X, X_\perp]^HA[X, X_\perp] = \begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}, \qquad A = [X, X_\perp]\begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}[X, X_\perp]^H,$$

where
$$X^HAX \equiv A_{11} \in \mathcal{H}(k), \qquad X_\perp^HAX_\perp \equiv A_{22} \in \mathcal{H}(n-k). \qquad (1.27)$$
Now from $Y^H[X, X_\perp] = [C^H, S^H] = [C, S^H]$, it follows that
$$Y^HAY = Y^H[X, X_\perp]\begin{bmatrix} A_{11} & 0 \\ 0 & A_{22} \end{bmatrix}[X, X_\perp]^HY = CA_{11}C + S^HA_{22}S. \qquad (1.28)$$
The expression we want to bound now takes the form
$$\lambda(X^HAX) - \lambda(Y^HAY) = \lambda(A_{11}) - \lambda(CA_{11}C + S^HA_{22}S)$$
$$= \lambda(A_{11}) - \lambda(CA_{11}C) + \lambda(CA_{11}C) - \lambda(CA_{11}C + S^HA_{22}S) \qquad (1.29)$$
$$\prec \lambda(A_{11} - CA_{11}C) + \lambda(-S^HA_{22}S), \qquad (1.30)$$
where the last line used Lidskii's Theorem 1.5 twice, then (1.16), remembering that $\lambda(\cdot) \equiv \lambda^\downarrow(\cdot)$, see Section 1.3.1. Next (1.18), Theorems 1.10 and 1.9, and (1.26) give
$$|\lambda(-S^HA_{22}S)| = \sigma(S^HA_{22}S) \prec_w \sigma(S^H)\,\sigma(A_{22}S) \le \|A_{22}\|\,\sin^2\theta(\mathcal{X},\mathcal{Y}). \qquad (1.31)$$
This bounds the last term in (1.30).

Remark 1.4. These results are also applicable to the proof of Theorem 1.17, so we will refer to the above material again. These two proofs differ in the way the $\lambda(A_{11}) - \lambda(CA_{11}C)$ term is bounded in (1.29). Also, in our later proof of Theorem 1.20, we use the same material up to equation (1.28). We will not reconstruct the definitions of $C$, $S$, $A_{11}$, and $A_{22}$, but will refer to those given here.

The second term in (1.30) is bounded by (1.31); we now bound the first term. To do so we use the identity
$$A_{11} - CA_{11}C = (I - C)A_{11} + CA_{11}(I - C), \qquad (1.32)$$

together with (see Theorem 1.9 and (1.24))
$$\sigma((I-C)A_{11}) \le \|A_{11}\|\,\sigma(I-C) = \|A_{11}\|\,(e - \cos\theta(\mathcal{X},\mathcal{Y})), \qquad (1.33)$$
and (see also Theorem 1.10)
$$\sigma(CA_{11}(I-C)) \prec_w \sigma(C)\,\sigma(A_{11}(I-C)) \le \sigma(A_{11}(I-C)) \le \|A_{11}\|\,\sigma(I-C) = \|A_{11}\|\,(e - \cos\theta(\mathcal{X},\mathcal{Y})). \qquad (1.34)$$
Discarding the first $C$ in $\sigma(CA_{11}(I-C))$ is no real loss; see Section 1.5. Using (1.32) and applying (1.18), Theorem 1.7, and (1.16) and (1.17) with (1.33) and (1.34), gives
$$|\lambda(A_{11} - CA_{11}C)| = \sigma((I-C)A_{11} + CA_{11}(I-C)) \prec_w \sigma((I-C)A_{11}) + \sigma(CA_{11}(I-C)) \prec_w 2\|A_{11}\|\,(e - \cos\theta(\mathcal{X},\mathcal{Y})). \qquad (1.35)$$
This bounds the first term on the right of (1.30).

We now combine our bounds. Apply (1.20) followed by (1.19) to (1.30), then use (1.35) and (1.31) with (1.16), together with $\|A_{11}\|, \|A_{22}\| \le \|A\|$, to obtain
$$|\lambda(X^HAX) - \lambda(Y^HAY)| \prec_w |\lambda(A_{11} - CA_{11}C) + \lambda(-S^HA_{22}S)|$$
$$\prec_w |\lambda(A_{11} - CA_{11}C)|^\downarrow + |\lambda(-S^HA_{22}S)|^\downarrow$$
$$\prec_w 2\|A_{11}\|\,(e - \cos\theta(\mathcal{X},\mathcal{Y})) + \|A_{22}\|\,\sin^2\theta(\mathcal{X},\mathcal{Y})$$
$$\le \|A\|\left[\,2(e - \cos\theta(\mathcal{X},\mathcal{Y})) + \sin^2\theta(\mathcal{X},\mathcal{Y})\,\right]. \qquad (1.36)$$
Our final step is to replace $\|A\|$ by an expression involving $\mathrm{spr}(A)$. Observe that the difference between Ritz values is invariant under any shift of $A$ by $\alpha I$, $\alpha \in \mathbb{R}$, so we shift $A$ in a way that minimizes $\|A\|$. This occurs when $0$ is exactly in the middle of the spectrum, in which case $\|A\| = \mathrm{spr}(A)/2$. Combining this observation with (1.36) (and remembering (1.17)) completes the proof of (1.23).


More information

Numerical Methods I Eigenvalue Problems

Numerical Methods I Eigenvalue Problems Numerical Methods I Eigenvalue Problems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 October 2nd, 2014 A. Donev (Courant Institute) Lecture

More information

Repeated Eigenvalues and Symmetric Matrices

Repeated Eigenvalues and Symmetric Matrices Repeated Eigenvalues and Symmetric Matrices. Introduction In this Section we further develop the theory of eigenvalues and eigenvectors in two distinct directions. Firstly we look at matrices where one

More information

RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY

RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY ILSE C.F. IPSEN Abstract. Absolute and relative perturbation bounds for Ritz values of complex square matrices are presented. The bounds exploit quasi-sparsity

More information

MATH 423 Linear Algebra II Lecture 33: Diagonalization of normal operators.

MATH 423 Linear Algebra II Lecture 33: Diagonalization of normal operators. MATH 423 Linear Algebra II Lecture 33: Diagonalization of normal operators. Adjoint operator and adjoint matrix Given a linear operator L on an inner product space V, the adjoint of L is a transformation

More information

G1110 & 852G1 Numerical Linear Algebra

G1110 & 852G1 Numerical Linear Algebra The University of Sussex Department of Mathematics G & 85G Numerical Linear Algebra Lecture Notes Autumn Term Kerstin Hesse (w aw S w a w w (w aw H(wa = (w aw + w Figure : Geometric explanation of the

More information

On prescribing Ritz values and GMRES residual norms generated by Arnoldi processes

On prescribing Ritz values and GMRES residual norms generated by Arnoldi processes On prescribing Ritz values and GMRES residual norms generated by Arnoldi processes Jurjen Duintjer Tebbens Institute of Computer Science Academy of Sciences of the Czech Republic joint work with Gérard

More information

October 25, 2013 INNER PRODUCT SPACES

October 25, 2013 INNER PRODUCT SPACES October 25, 2013 INNER PRODUCT SPACES RODICA D. COSTIN Contents 1. Inner product 2 1.1. Inner product 2 1.2. Inner product spaces 4 2. Orthogonal bases 5 2.1. Existence of an orthogonal basis 7 2.2. Orthogonal

More information

On the influence of eigenvalues on Bi-CG residual norms

On the influence of eigenvalues on Bi-CG residual norms On the influence of eigenvalues on Bi-CG residual norms Jurjen Duintjer Tebbens Institute of Computer Science Academy of Sciences of the Czech Republic duintjertebbens@cs.cas.cz Gérard Meurant 30, rue

More information

EECS 275 Matrix Computation

EECS 275 Matrix Computation EECS 275 Matrix Computation Ming-Hsuan Yang Electrical Engineering and Computer Science University of California at Merced Merced, CA 95344 http://faculty.ucmerced.edu/mhyang Lecture 17 1 / 26 Overview

More information

In English, this means that if we travel on a straight line between any two points in C, then we never leave C.

In English, this means that if we travel on a straight line between any two points in C, then we never leave C. Convex sets In this section, we will be introduced to some of the mathematical fundamentals of convex sets. In order to motivate some of the definitions, we will look at the closest point problem from

More information

Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated.

Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated. Math 504, Homework 5 Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated 1 Find the eigenvalues and the associated eigenspaces

More information

6.4 Krylov Subspaces and Conjugate Gradients

6.4 Krylov Subspaces and Conjugate Gradients 6.4 Krylov Subspaces and Conjugate Gradients Our original equation is Ax = b. The preconditioned equation is P Ax = P b. When we write P, we never intend that an inverse will be explicitly computed. P

More information

MA201: Further Mathematical Methods (Linear Algebra) 2002

MA201: Further Mathematical Methods (Linear Algebra) 2002 MA201: Further Mathematical Methods (Linear Algebra) 2002 General Information Teaching This course involves two types of teaching session that you should be attending: Lectures This is a half unit course

More information

Characterization of half-radial matrices

Characterization of half-radial matrices Characterization of half-radial matrices Iveta Hnětynková, Petr Tichý Faculty of Mathematics and Physics, Charles University, Sokolovská 83, Prague 8, Czech Republic Abstract Numerical radius r(a) is the

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Iterative methods for symmetric eigenvalue problems

Iterative methods for symmetric eigenvalue problems s Iterative s for symmetric eigenvalue problems, PhD McMaster University School of Computational Engineering and Science February 11, 2008 s 1 The power and its variants Inverse power Rayleigh quotient

More information

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit V: Eigenvalue Problems Lecturer: Dr. David Knezevic Unit V: Eigenvalue Problems Chapter V.4: Krylov Subspace Methods 2 / 51 Krylov Subspace Methods In this chapter we give

More information

The following definition is fundamental.

The following definition is fundamental. 1. Some Basics from Linear Algebra With these notes, I will try and clarify certain topics that I only quickly mention in class. First and foremost, I will assume that you are familiar with many basic

More information

Solution of eigenvalue problems. Subspace iteration, The symmetric Lanczos algorithm. Harmonic Ritz values, Jacobi-Davidson s method

Solution of eigenvalue problems. Subspace iteration, The symmetric Lanczos algorithm. Harmonic Ritz values, Jacobi-Davidson s method Solution of eigenvalue problems Introduction motivation Projection methods for eigenvalue problems Subspace iteration, The symmetric Lanczos algorithm Nonsymmetric Lanczos procedure; Implicit restarts

More information

The Eigenvalue Problem: Perturbation Theory

The Eigenvalue Problem: Perturbation Theory Jim Lambers MAT 610 Summer Session 2009-10 Lecture 13 Notes These notes correspond to Sections 7.2 and 8.1 in the text. The Eigenvalue Problem: Perturbation Theory The Unsymmetric Eigenvalue Problem Just

More information

A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY

A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY A HARMONIC RESTARTED ARNOLDI ALGORITHM FOR CALCULATING EIGENVALUES AND DETERMINING MULTIPLICITY RONALD B. MORGAN AND MIN ZENG Abstract. A restarted Arnoldi algorithm is given that computes eigenvalues

More information

Key words. conjugate gradients, normwise backward error, incremental norm estimation.

Key words. conjugate gradients, normwise backward error, incremental norm estimation. Proceedings of ALGORITMY 2016 pp. 323 332 ON ERROR ESTIMATION IN THE CONJUGATE GRADIENT METHOD: NORMWISE BACKWARD ERROR PETR TICHÝ Abstract. Using an idea of Duff and Vömel [BIT, 42 (2002), pp. 300 322

More information

MAJORIZATION FOR CHANGES IN ANGLES BETWEEN SUBSPACES, RITZ VALUES, AND GRAPH LAPLACIAN SPECTRA

MAJORIZATION FOR CHANGES IN ANGLES BETWEEN SUBSPACES, RITZ VALUES, AND GRAPH LAPLACIAN SPECTRA SIAM J. MATRIX ANAL. APPL. Vol.?, No.?, pp.?? c 2006 Society for Industrial and Applied Mathematics MAJORIZATION FOR CHANGES IN ANGLES BETWEEN SUBSPACES, RITZ VALUES, AND GRAPH LAPLACIAN SPECTRA ANDREW

More information

HOMEWORK PROBLEMS FROM STRANG S LINEAR ALGEBRA AND ITS APPLICATIONS (4TH EDITION)

HOMEWORK PROBLEMS FROM STRANG S LINEAR ALGEBRA AND ITS APPLICATIONS (4TH EDITION) HOMEWORK PROBLEMS FROM STRANG S LINEAR ALGEBRA AND ITS APPLICATIONS (4TH EDITION) PROFESSOR STEVEN MILLER: BROWN UNIVERSITY: SPRING 2007 1. CHAPTER 1: MATRICES AND GAUSSIAN ELIMINATION Page 9, # 3: Describe

More information

Index. for generalized eigenvalue problem, butterfly form, 211

Index. for generalized eigenvalue problem, butterfly form, 211 Index ad hoc shifts, 165 aggressive early deflation, 205 207 algebraic multiplicity, 35 algebraic Riccati equation, 100 Arnoldi process, 372 block, 418 Hamiltonian skew symmetric, 420 implicitly restarted,

More information

Eigenvalue Problems CHAPTER 1 : PRELIMINARIES

Eigenvalue Problems CHAPTER 1 : PRELIMINARIES Eigenvalue Problems CHAPTER 1 : PRELIMINARIES Heinrich Voss voss@tu-harburg.de Hamburg University of Technology Institute of Mathematics TUHH Heinrich Voss Preliminaries Eigenvalue problems 2012 1 / 14

More information

QR-decomposition. The QR-decomposition of an n k matrix A, k n, is an n n unitary matrix Q and an n k upper triangular matrix R for which A = QR

QR-decomposition. The QR-decomposition of an n k matrix A, k n, is an n n unitary matrix Q and an n k upper triangular matrix R for which A = QR QR-decomposition The QR-decomposition of an n k matrix A, k n, is an n n unitary matrix Q and an n k upper triangular matrix R for which In Matlab A = QR [Q,R]=qr(A); Note. The QR-decomposition is unique

More information

A PRIMER ON SESQUILINEAR FORMS

A PRIMER ON SESQUILINEAR FORMS A PRIMER ON SESQUILINEAR FORMS BRIAN OSSERMAN This is an alternative presentation of most of the material from 8., 8.2, 8.3, 8.4, 8.5 and 8.8 of Artin s book. Any terminology (such as sesquilinear form

More information

Notes on Eigenvalues, Singular Values and QR

Notes on Eigenvalues, Singular Values and QR Notes on Eigenvalues, Singular Values and QR Michael Overton, Numerical Computing, Spring 2017 March 30, 2017 1 Eigenvalues Everyone who has studied linear algebra knows the definition: given a square

More information

Singular Value Decomposition

Singular Value Decomposition Chapter 5 Singular Value Decomposition We now reach an important Chapter in this course concerned with the Singular Value Decomposition of a matrix A. SVD, as it is commonly referred to, is one of the

More information

Solution of eigenvalue problems. Subspace iteration, The symmetric Lanczos algorithm. Harmonic Ritz values, Jacobi-Davidson s method

Solution of eigenvalue problems. Subspace iteration, The symmetric Lanczos algorithm. Harmonic Ritz values, Jacobi-Davidson s method Solution of eigenvalue problems Introduction motivation Projection methods for eigenvalue problems Subspace iteration, The symmetric Lanczos algorithm Nonsymmetric Lanczos procedure; Implicit restarts

More information

Introduction. Chapter One

Introduction. Chapter One Chapter One Introduction The aim of this book is to describe and explain the beautiful mathematical relationships between matrices, moments, orthogonal polynomials, quadrature rules and the Lanczos and

More information

Linear Algebra Review

Linear Algebra Review Chapter 1 Linear Algebra Review It is assumed that you have had a course in linear algebra, and are familiar with matrix multiplication, eigenvectors, etc. I will review some of these terms here, but quite

More information

Eigenvalues and Eigenvectors

Eigenvalues and Eigenvectors /88 Chia-Ping Chen Department of Computer Science and Engineering National Sun Yat-sen University Linear Algebra Eigenvalue Problem /88 Eigenvalue Equation By definition, the eigenvalue equation for matrix

More information

642:550, Summer 2004, Supplement 6 The Perron-Frobenius Theorem. Summer 2004

642:550, Summer 2004, Supplement 6 The Perron-Frobenius Theorem. Summer 2004 642:550, Summer 2004, Supplement 6 The Perron-Frobenius Theorem. Summer 2004 Introduction Square matrices whose entries are all nonnegative have special properties. This was mentioned briefly in Section

More information

Arnoldi Methods in SLEPc

Arnoldi Methods in SLEPc Scalable Library for Eigenvalue Problem Computations SLEPc Technical Report STR-4 Available at http://slepc.upv.es Arnoldi Methods in SLEPc V. Hernández J. E. Román A. Tomás V. Vidal Last update: October,

More information

Notes on Linear Algebra and Matrix Theory

Notes on Linear Algebra and Matrix Theory Massimo Franceschet featuring Enrico Bozzo Scalar product The scalar product (a.k.a. dot product or inner product) of two real vectors x = (x 1,..., x n ) and y = (y 1,..., y n ) is not a vector but a

More information

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88

Math Camp Lecture 4: Linear Algebra. Xiao Yu Wang. Aug 2010 MIT. Xiao Yu Wang (MIT) Math Camp /10 1 / 88 Math Camp 2010 Lecture 4: Linear Algebra Xiao Yu Wang MIT Aug 2010 Xiao Yu Wang (MIT) Math Camp 2010 08/10 1 / 88 Linear Algebra Game Plan Vector Spaces Linear Transformations and Matrices Determinant

More information

Throughout these notes we assume V, W are finite dimensional inner product spaces over C.

Throughout these notes we assume V, W are finite dimensional inner product spaces over C. Math 342 - Linear Algebra II Notes Throughout these notes we assume V, W are finite dimensional inner product spaces over C 1 Upper Triangular Representation Proposition: Let T L(V ) There exists an orthonormal

More information

On the Ritz values of normal matrices

On the Ritz values of normal matrices On the Ritz values of normal matrices Zvonimir Bujanović Faculty of Science Department of Mathematics University of Zagreb June 13, 2011 ApplMath11 7th Conference on Applied Mathematics and Scientific

More information

Diagonalizing Matrices

Diagonalizing Matrices Diagonalizing Matrices Massoud Malek A A Let A = A k be an n n non-singular matrix and let B = A = [B, B,, B k,, B n ] Then A n A B = A A 0 0 A k [B, B,, B k,, B n ] = 0 0 = I n 0 A n Notice that A i B

More information

LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM

LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM LINEAR ALGEBRA BOOT CAMP WEEK 4: THE SPECTRAL THEOREM Unless otherwise stated, all vector spaces in this worksheet are finite dimensional and the scalar field F is R or C. Definition 1. A linear operator

More information

Numerical Methods - Numerical Linear Algebra

Numerical Methods - Numerical Linear Algebra Numerical Methods - Numerical Linear Algebra Y. K. Goh Universiti Tunku Abdul Rahman 2013 Y. K. Goh (UTAR) Numerical Methods - Numerical Linear Algebra I 2013 1 / 62 Outline 1 Motivation 2 Solving Linear

More information

A Note on Eigenvalues of Perturbed Hermitian Matrices

A Note on Eigenvalues of Perturbed Hermitian Matrices A Note on Eigenvalues of Perturbed Hermitian Matrices Chi-Kwong Li Ren-Cang Li July 2004 Let ( H1 E A = E H 2 Abstract and à = ( H1 H 2 be Hermitian matrices with eigenvalues λ 1 λ k and λ 1 λ k, respectively.

More information

Spanning and Independence Properties of Finite Frames

Spanning and Independence Properties of Finite Frames Chapter 1 Spanning and Independence Properties of Finite Frames Peter G. Casazza and Darrin Speegle Abstract The fundamental notion of frame theory is redundancy. It is this property which makes frames

More information

Mathematical foundations - linear algebra

Mathematical foundations - linear algebra Mathematical foundations - linear algebra Andrea Passerini passerini@disi.unitn.it Machine Learning Vector space Definition (over reals) A set X is called a vector space over IR if addition and scalar

More information

Block Bidiagonal Decomposition and Least Squares Problems

Block Bidiagonal Decomposition and Least Squares Problems Block Bidiagonal Decomposition and Least Squares Problems Åke Björck Department of Mathematics Linköping University Perspectives in Numerical Analysis, Helsinki, May 27 29, 2008 Outline Bidiagonal Decomposition

More information

MAJORIZATION FOR CHANGES IN ANGLES BETWEEN SUBSPACES, RITZ VALUES, AND GRAPH LAPLACIAN SPECTRA

MAJORIZATION FOR CHANGES IN ANGLES BETWEEN SUBSPACES, RITZ VALUES, AND GRAPH LAPLACIAN SPECTRA SIAM J. MATRIX ANAL. APPL. Vol. 29, No. 1, pp. 15 32 c 2006 Society for Industrial and Applied Mathematics MAJORIZATION FOR CHANGES IN ANGLES BETWEEN SUBSPACES, RITZ VALUES, AND GRAPH LAPLACIAN SPECTRA

More information

Krylov Subspace Methods that Are Based on the Minimization of the Residual

Krylov Subspace Methods that Are Based on the Minimization of the Residual Chapter 5 Krylov Subspace Methods that Are Based on the Minimization of the Residual Remark 51 Goal he goal of these methods consists in determining x k x 0 +K k r 0,A such that the corresponding Euclidean

More information

4.8 Arnoldi Iteration, Krylov Subspaces and GMRES

4.8 Arnoldi Iteration, Krylov Subspaces and GMRES 48 Arnoldi Iteration, Krylov Subspaces and GMRES We start with the problem of using a similarity transformation to convert an n n matrix A to upper Hessenberg form H, ie, A = QHQ, (30) with an appropriate

More information

22.4. Numerical Determination of Eigenvalues and Eigenvectors. Introduction. Prerequisites. Learning Outcomes

22.4. Numerical Determination of Eigenvalues and Eigenvectors. Introduction. Prerequisites. Learning Outcomes Numerical Determination of Eigenvalues and Eigenvectors 22.4 Introduction In Section 22. it was shown how to obtain eigenvalues and eigenvectors for low order matrices, 2 2 and. This involved firstly solving

More information

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education

MTH Linear Algebra. Study Guide. Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education MTH 3 Linear Algebra Study Guide Dr. Tony Yee Department of Mathematics and Information Technology The Hong Kong Institute of Education June 3, ii Contents Table of Contents iii Matrix Algebra. Real Life

More information

Chapter 6. Algebraic eigenvalue problems Introduction Introduction 113. Das also war des Pudels Kern!

Chapter 6. Algebraic eigenvalue problems Introduction Introduction 113. Das also war des Pudels Kern! 6.0. Introduction 113 Chapter 6 Algebraic eigenvalue problems Das also war des Pudels Kern! GOETHE. 6.0. Introduction Determination of eigenvalues and eigenvectors of matrices is one of the most important

More information

Review problems for MA 54, Fall 2004.

Review problems for MA 54, Fall 2004. Review problems for MA 54, Fall 2004. Below are the review problems for the final. They are mostly homework problems, or very similar. If you are comfortable doing these problems, you should be fine on

More information

Using the Karush-Kuhn-Tucker Conditions to Analyze the Convergence Rate of Preconditioned Eigenvalue Solvers

Using the Karush-Kuhn-Tucker Conditions to Analyze the Convergence Rate of Preconditioned Eigenvalue Solvers Using the Karush-Kuhn-Tucker Conditions to Analyze the Convergence Rate of Preconditioned Eigenvalue Solvers Merico Argentati University of Colorado Denver Joint work with Andrew V. Knyazev, Klaus Neymeyr

More information

Direct methods for symmetric eigenvalue problems

Direct methods for symmetric eigenvalue problems Direct methods for symmetric eigenvalue problems, PhD McMaster University School of Computational Engineering and Science February 4, 2008 1 Theoretical background Posing the question Perturbation theory

More information

Alternative correction equations in the Jacobi-Davidson method

Alternative correction equations in the Jacobi-Davidson method Chapter 2 Alternative correction equations in the Jacobi-Davidson method Menno Genseberger and Gerard Sleijpen Abstract The correction equation in the Jacobi-Davidson method is effective in a subspace

More information

Lecture Notes for Inf-Mat 3350/4350, Tom Lyche

Lecture Notes for Inf-Mat 3350/4350, Tom Lyche Lecture Notes for Inf-Mat 3350/4350, 2007 Tom Lyche August 5, 2007 2 Contents Preface vii I A Review of Linear Algebra 1 1 Introduction 3 1.1 Notation............................... 3 2 Vectors 5 2.1 Vector

More information

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 Instructions Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 The exam consists of four problems, each having multiple parts. You should attempt to solve all four problems. 1.

More information

Large-scale eigenvalue problems

Large-scale eigenvalue problems ELE 538B: Mathematics of High-Dimensional Data Large-scale eigenvalue problems Yuxin Chen Princeton University, Fall 208 Outline Power method Lanczos algorithm Eigenvalue problems 4-2 Eigendecomposition

More information

Lecture notes: Applied linear algebra Part 1. Version 2

Lecture notes: Applied linear algebra Part 1. Version 2 Lecture notes: Applied linear algebra Part 1. Version 2 Michael Karow Berlin University of Technology karow@math.tu-berlin.de October 2, 2008 1 Notation, basic notions and facts 1.1 Subspaces, range and

More information

6 Inner Product Spaces

6 Inner Product Spaces Lectures 16,17,18 6 Inner Product Spaces 6.1 Basic Definition Parallelogram law, the ability to measure angle between two vectors and in particular, the concept of perpendicularity make the euclidean space

More information

Numerical Linear Algebra Homework Assignment - Week 2

Numerical Linear Algebra Homework Assignment - Week 2 Numerical Linear Algebra Homework Assignment - Week 2 Đoàn Trần Nguyên Tùng Student ID: 1411352 8th October 2016 Exercise 2.1: Show that if a matrix A is both triangular and unitary, then it is diagonal.

More information

On the Perturbation of the Q-factor of the QR Factorization

On the Perturbation of the Q-factor of the QR Factorization NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. ; :1 6 [Version: /9/18 v1.] On the Perturbation of the Q-factor of the QR Factorization X.-W. Chang McGill University, School of Comptuer

More information

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix

Scientific Computing with Case Studies SIAM Press, Lecture Notes for Unit VII Sparse Matrix Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit VII Sparse Matrix Computations Part 1: Direct Methods Dianne P. O Leary c 2008

More information

NEW ESTIMATES FOR RITZ VECTORS

NEW ESTIMATES FOR RITZ VECTORS MATHEMATICS OF COMPUTATION Volume 66, Number 219, July 1997, Pages 985 995 S 0025-5718(97)00855-7 NEW ESTIMATES FOR RITZ VECTORS ANDREW V. KNYAZEV Abstract. The following estimate for the Rayleigh Ritz

More information

Principal Angles Between Subspaces and Their Tangents

Principal Angles Between Subspaces and Their Tangents MITSUBISI ELECTRIC RESEARC LABORATORIES http://wwwmerlcom Principal Angles Between Subspaces and Their Tangents Knyazev, AV; Zhu, P TR2012-058 September 2012 Abstract Principal angles between subspaces

More information

MATRICES ARE SIMILAR TO TRIANGULAR MATRICES

MATRICES ARE SIMILAR TO TRIANGULAR MATRICES MATRICES ARE SIMILAR TO TRIANGULAR MATRICES 1 Complex matrices Recall that the complex numbers are given by a + ib where a and b are real and i is the imaginary unity, ie, i 2 = 1 In what we describe below,

More information

Rigid Geometric Transformations

Rigid Geometric Transformations Rigid Geometric Transformations Carlo Tomasi This note is a quick refresher of the geometry of rigid transformations in three-dimensional space, expressed in Cartesian coordinates. 1 Cartesian Coordinates

More information

Block Lanczos Tridiagonalization of Complex Symmetric Matrices

Block Lanczos Tridiagonalization of Complex Symmetric Matrices Block Lanczos Tridiagonalization of Complex Symmetric Matrices Sanzheng Qiao, Guohong Liu, Wei Xu Department of Computing and Software, McMaster University, Hamilton, Ontario L8S 4L7 ABSTRACT The classic

More information

4.6 Bases and Dimension

4.6 Bases and Dimension 46 Bases and Dimension 281 40 (a) Show that {1,x,x 2,x 3 } is linearly independent on every interval (b) If f k (x) = x k for k = 0, 1,,n, show that {f 0,f 1,,f n } is linearly independent on every interval

More information