On the Ritz values of normal matrices Zvonimir Bujanović Faculty of Science Department of Mathematics University of Zagreb June 13, 2011 ApplMath11 7th Conference on Applied Mathematics and Scientific Computing
Contents
1 Convergence of the restarted Arnoldi method
2 Interlacing for normal matrices
3 Characterization of Ritz values using a Cauchy matrix
1 Convergence of the restarted Arnoldi method
The Arnoldi method
Task: determine a few eigenvalues $\lambda_1, \lambda_2, \ldots$ of a large, sparse matrix $A \in \mathbb{C}^{n \times n}$.
The Arnoldi method:
- choose an initial vector $v$;
- iteratively build an $m$-dimensional Krylov subspace $K_m(v; A) = \operatorname{span}\{v, Av, \ldots, A^{m-1}v\}$;
- this is represented as $A V_m = V_m H_m + (\beta_m v_{m+1}) e_m^*$;
- approximate the eigenvalues of $A$ with the Ritz values $\theta_1, \theta_2, \ldots, \theta_m$ (the eigenvalues of $H_m$).
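The steps of the Arnoldi method can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used in ARPACK or in the talk: the function name `arnoldi`, the random test matrix, and the sizes `n = 30`, `m = 5` are assumptions made for the example, and breakdown (a zero subdiagonal entry) is not handled.

```python
import numpy as np

def arnoldi(A, v, m):
    """Run m Arnoldi steps; return V (n x (m+1)) and H ((m+1) x m)."""
    n = A.shape[0]
    V = np.zeros((n, m + 1), dtype=complex)
    H = np.zeros((m + 1, m), dtype=complex)
    V[:, 0] = v / np.linalg.norm(v)
    for j in range(m):
        w = A @ V[:, j]
        # modified Gram-Schmidt against the basis vectors built so far
        for i in range(j + 1):
            H[i, j] = np.vdot(V[:, i], w)
            w = w - H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)      # beta_j (assumed nonzero here)
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(0)
n, m = 30, 5                                  # illustrative sizes (assumed)
A = rng.standard_normal((n, n))
v = rng.standard_normal(n)

V, H = arnoldi(A, v, m)
Hm = H[:m, :]

# Arnoldi relation: A V_m - V_m H_m - beta_m v_{m+1} e_m^* should vanish
residual = A @ V[:, :m] - V[:, :m] @ Hm
residual[:, m - 1] -= H[m, m - 1] * V[:, m]

ritz_values = np.linalg.eigvals(Hm)           # Ritz values = eigenvalues of H_m
```

The array `residual` measures how well the computed factors satisfy the Arnoldi relation; in exact arithmetic it is zero.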
The restarted Arnoldi method
The restarted Arnoldi method (ARPACK, Matlab eigs):
1 initialization: build $K_1(v; A), K_2(v; A), \ldots, K_{m-p}(v; A)$;
2 augmentation: build $K_{m-p+1}(v; A), K_{m-p+2}(v; A), \ldots, K_m(v; A)$;
3 restart: choose $\tilde v = (A - \sigma_1 I)(A - \sigma_2 I) \cdots (A - \sigma_p I) v$; $\tilde v$ is such that $K_{m-p}(\tilde v; A)$ is already computed; set $v = \tilde v$, go to 2.
The roots $\sigma_1, \ldots, \sigma_p$ are called shifts. Exact shifts = unwanted Ritz values from $K_m(v; A)$.
Methods of restart: Sorensen (IRAM), Stewart (Krylov-Schur).
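The restart step can be illustrated by applying the polynomial filter explicitly. The sketch below is an assumption-laden toy: it uses a small Hermitian (diagonal) test matrix, forms $\tilde v$ by explicit multiplication rather than by the implicit techniques of IRAM or Krylov-Schur, and computes Ritz values from a QR factorization of the Krylov matrix, which is only reasonable for very small $m$. The helper name `ritz_values` is hypothetical.

```python
import numpy as np

def ritz_values(A, v, m):
    """Ritz values of Hermitian A from K_m(v; A), in increasing order."""
    K = np.empty((A.shape[0], m))
    K[:, 0] = v / np.linalg.norm(v)
    for j in range(1, m):
        w = A @ K[:, j - 1]
        K[:, j] = w / np.linalg.norm(w)       # rescaling does not change the span
    Q, _ = np.linalg.qr(K)                    # orthonormal basis of K_m(v; A)
    return np.linalg.eigvalsh(Q.T @ A @ Q)

n, m, p = 8, 4, 2                             # illustrative sizes (assumed)
A = np.diag(np.arange(1.0, n + 1))            # small Hermitian test matrix (assumed)
v = np.ones(n)

theta = ritz_values(A, v, m)
shifts, wanted = theta[:p], theta[p:]         # exact shifts = p smallest Ritz values

# restart vector: v_tilde = (A - sigma_1 I) ... (A - sigma_p I) v
v_tilde = v.copy()
for sigma in shifts:
    v_tilde = A @ v_tilde - sigma * v_tilde

theta_restarted = ritz_values(A, v_tilde, m - p)
```

With exact shifts, the Ritz values from $K_{m-p}(\tilde v; A)$ coincide (up to rounding) with the wanted Ritz values from $K_m(v; A)$, so `theta_restarted` should match `wanted`.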
Convergence: Hermitian matrices
Theorem (Sorensen). Let:
- $A$ be Hermitian;
- shifts = smallest $p$ Ritz values.
Then, under some mild conditions, the Ritz values converge to the $m - p$ largest eigenvalues.
The main tool: the Cauchy interlacing property $\lambda_{j+(n-m)} \le \theta_j \le \lambda_j$.
- monotonicity of Ritz values: $\theta_j^{(1)} \le \theta_j^{(2)} \le \theta_j^{(3)} \le \ldots$
- bounded from above: $\theta_j^{(k)} \le \lambda_j$
- restarting preserves wanted Ritz values
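The Cauchy interlacing property is easy to check numerically. The following sketch assumes eigenvalues and Ritz values are both ordered decreasingly, and uses a random real symmetric matrix and a random (not necessarily Krylov) subspace; the sizes and the tolerance `tol` are arbitrary choices for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 12, 5                                        # illustrative sizes (assumed)
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                                   # random real symmetric matrix
X, _ = np.linalg.qr(rng.standard_normal((n, m)))    # orthonormal basis of a random subspace

lam = np.sort(np.linalg.eigvalsh(A))[::-1]                # eigenvalues, decreasing
theta = np.sort(np.linalg.eigvalsh(X.T @ A @ X))[::-1]    # Ritz values, decreasing

tol = 1e-12                                         # slack for rounding errors
# lambda_{j+(n-m)} <= theta_j <= lambda_j, written with 0-based indices
lower_ok = all(lam[j + (n - m)] <= theta[j] + tol for j in range(m))
upper_ok = all(theta[j] <= lam[j] + tol for j in range(m))
```

Both flags should come out true for any Hermitian matrix and any orthonormal `X`, since the property does not depend on the subspace being a Krylov subspace.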
Convergence: the general case
Embree: Convergence fails for a whole class of matrices, of any dimension. The wanted Ritz vectors are removed from the Krylov subspace.
Example: $105 \times 105$ non-normal matrix $A$, 10-dimensional Krylov subspace.
2 Interlacing for normal matrices
Reduction to the Hermitian case
Definition. Let $\varphi \in [0, 2\pi)$. For $a \in \mathbb{C}$, denote
$$a^{(\varphi)} = \frac{1}{2}\left(e^{-i\varphi} a + e^{i\varphi} \bar a\right).$$
For a matrix $A$, denote
$$A^{(\varphi)} = \frac{1}{2}\left(e^{-i\varphi} A + e^{i\varphi} A^*\right).$$
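A small numerical sketch of this definition follows. It assumes the sign convention $a^{(\varphi)} = \operatorname{Re}(e^{-i\varphi} a)$, that is, the exponent $-i\varphi$ multiplies $A$ and $+i\varphi$ multiplies $A^*$; the function name `rotated_hermitian_part` and the random normal test matrix are made up for the example. It checks that $A^{(\varphi)}$ is Hermitian and that, for normal $A$, its eigenvalues are the values $\lambda_i^{(\varphi)}$.

```python
import numpy as np

def rotated_hermitian_part(A, phi):
    """A^(phi) = (exp(-i phi) A + exp(i phi) A^*) / 2 (assumed sign convention)."""
    return 0.5 * (np.exp(-1j * phi) * A + np.exp(1j * phi) * A.conj().T)

rng = np.random.default_rng(1)
n = 6
lam = rng.standard_normal(n) + 1j * rng.standard_normal(n)   # prescribed eigenvalues
Q, _ = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
A = Q @ np.diag(lam) @ Q.conj().T            # unitarily diagonalizable, hence normal

phi = 0.7                                    # arbitrary angle in [0, 2*pi)
A_phi = rotated_hermitian_part(A, phi)

eig_A_phi = np.sort(np.linalg.eigvalsh(A_phi))        # eigenvalues of A^(phi)
lam_phi = np.sort((np.exp(-1j * phi) * lam).real)     # lambda_i^(phi) = Re(exp(-i phi) lambda_i)
```

Geometrically, `lam_phi` holds the coordinates of the eigenvalues along the direction $e^{i\varphi}$, which is what reduces statements about a normal matrix to statements about a Hermitian one.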
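Restarting the Arnoldi method and $A^{(\varphi)}$
- Interlacing does not hold for $\theta^{(\varphi)}$.
- Interlacing for $\tilde\theta$, the eigenvalues of $H^{(\varphi)}$ ($A$ arbitrary): $\lambda^{(\varphi)}_{(n-m)+j} \le \tilde\theta_j \le \lambda^{(\varphi)}_j$.
- $|\tilde\theta - \theta^{(\varphi)}| < \frac{1}{2} r$.
- $H = X^* A X$, $X_+ = [X \ x]$, $H_+ = X_+^* A X_+$, then $\tilde\theta_{j+1,+} \le \tilde\theta_j \le \tilde\theta_{j,+}$.
- Restart preserves $\theta^{(\varphi)}$, does not preserve $\tilde\theta$ (no such shifts exist!)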
3 Characterization of Ritz values using a Cauchy matrix
Geometry of the Ritz values
Let $A \in \mathbb{C}^{n \times n}$. When does a set $\{\theta_1, \ldots, \theta_m\}$ of complex numbers represent Ritz values for $A$, using any $m$-dimensional subspace?
Case considered here: $A$ normal, Krylov subspaces.
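Geometry of the Ritz values
Notation: $\Lambda = (\lambda_1, \ldots, \lambda_n)$, $\Theta = (\theta_1, \ldots, \theta_m)$.
Cauchy matrix $C(\Lambda, \Theta) \in \mathbb{C}^{m \times n}$:
$$C(\Lambda, \Theta) = \begin{bmatrix}
\frac{1}{\lambda_1 - \theta_1} & \frac{1}{\lambda_2 - \theta_1} & \cdots & \frac{1}{\lambda_n - \theta_1} \\
\frac{1}{\lambda_1 - \theta_2} & \frac{1}{\lambda_2 - \theta_2} & \cdots & \frac{1}{\lambda_n - \theta_2} \\
\vdots & \vdots & & \vdots \\
\frac{1}{\lambda_1 - \theta_m} & \frac{1}{\lambda_2 - \theta_m} & \cdots & \frac{1}{\lambda_n - \theta_m}
\end{bmatrix}$$
Matrix $P(\Lambda, \Theta) \in \mathbb{C}^{m \times n}$ (index $i$ runs over the eigenvalues, $j$ over the Ritz values; row $j$ corresponds to $\theta_j$):
$$P(\Lambda, \Theta)_{i,j} = (\lambda_i - \theta_j) \prod_{\substack{q=1 \\ q \ne j}}^{m} |\lambda_i - \theta_q|^2$$
$x = (x_1, \ldots, x_n) > 0$ if each $x_i \ge 0$ and at least one $x_i > 0$.
$x = (x_1, \ldots, x_n) \gg 0$ if all $x_i > 0$.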
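Geometry of the Ritz values
Theorem (B.). Let $A \in \mathbb{C}^{n \times n}$ be normal with eigenvalues $\Lambda = (\lambda_1, \ldots, \lambda_n)$. Let $\Theta = (\theta_1, \ldots, \theta_m)$ denote an $m$-tuple of distinct complex numbers such that $\lambda_i \ne \theta_j$ for all $i = 1, \ldots, n$ and $j = 1, \ldots, m$. The following claims are equivalent:
(a) There exists a vector $v \in \mathbb{C}^n$ such that $\theta_1, \ldots, \theta_m$ are Ritz values for $A$ from $K_m(v; A)$.
(b) The linear system $P(\Lambda, \Theta)\hat v = 0$ has a solution $\hat v > 0$.
(c) The linear system $C(\Lambda, \Theta) w = 0$ has a solution $w > 0$.
The main idea of the proof: the minimum of $\|\pi(A)v\|$ over monic polynomials $\pi \in \mathcal{P}_m$ of degree $m$ is attained at $\pi = \kappa_{H_m}$, the characteristic polynomial of $H_m$.
The proof is constructive.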
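The direction (a) implies (b), (c) can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the construction from the proof: it takes a diagonal (hence normal) matrix so that the eigen-coordinates of $v$ are its entries, uses $\hat v_i = |v_i|^2$ and $w_i = \hat v_i \prod_q |\lambda_i - \theta_q|^2$ as candidate solutions, and computes Ritz values from a QR factorization of the Krylov matrix. All function names are hypothetical.

```python
import numpy as np

def cauchy_matrix(lam, theta):
    """C(Lambda, Theta): row j, column i holds 1 / (lam[i] - theta[j])."""
    return 1.0 / (lam[None, :] - theta[:, None])

def p_matrix(lam, theta):
    """P(Lambda, Theta): row j, column i holds
    (lam[i] - theta[j]) * prod_{q != j} |lam[i] - theta[q]|^2."""
    diff = lam[None, :] - theta[:, None]
    sq = np.abs(diff) ** 2
    return diff * (np.prod(sq, axis=0)[None, :] / sq)

def krylov_ritz_values(A, v, m):
    """Ritz values of A from K_m(v; A) (small m only)."""
    K = np.empty((A.shape[0], m), dtype=complex)
    K[:, 0] = v / np.linalg.norm(v)
    for j in range(1, m):
        w = A @ K[:, j - 1]
        K[:, j] = w / np.linalg.norm(w)
    Q, _ = np.linalg.qr(K)
    return np.linalg.eigvals(Q.conj().T @ A @ Q)

rng = np.random.default_rng(2)
n, m = 7, 3                                   # illustrative sizes (assumed)
lam = rng.standard_normal(n) + 1j * rng.standard_normal(n)
A = np.diag(lam)                              # diagonal, hence normal
v = rng.standard_normal(n) + 1j * rng.standard_normal(n)
theta = krylov_ritz_values(A, v, m)

v_hat = np.abs(v) ** 2                        # candidate solution of P v_hat = 0
w = np.prod(np.abs(lam[None, :] - theta[:, None]) ** 2, axis=0) * v_hat

P = p_matrix(lam, theta)
C = cauchy_matrix(lam, theta)
# relative residuals of the two homogeneous systems
rel_P = np.linalg.norm(P @ v_hat) / np.linalg.norm(np.abs(P) @ v_hat)
rel_C = np.linalg.norm(C @ w) / np.linalg.norm(np.abs(C) @ w)
```

Both `rel_P` and `rel_C` should be at the level of rounding error, with `w` entrywise positive. The converse direction (finding $v$ from a given $\Theta$) amounts to a linear feasibility problem and is not attempted here.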
Feasibility problem in LP
Linear programming: deciding whether $Ax = b$, $x > 0$ has a solution is as hard as maximizing $c^T x$ subject to $Ax \le b$.
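The dual problem
Theorem. The following is equivalent to (a), (b), (c):
(d) For each $u \in \mathbb{C}^m$ there exists an eigenvalue $\lambda_i$ such that
$$\operatorname{Re}\left(\sum_{j=1}^{m} \frac{u_j}{\lambda_i - \theta_j}\right) \le 0.$$
Farkas lemma: for $M \in \mathbb{R}^{m \times n}$ it holds that:
(i) Either $Mv = 0$ for some $v \gg 0$, or $u^T M > 0$ for some $u \in \mathbb{R}^m$.
(ii) Either $Mv = 0$ for some $v > 0$, or $u^T M \gg 0$ for some $u \in \mathbb{R}^m$.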
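One direction of this equivalence can be verified by hand. The short derivation below is a sketch that is not taken from the slides; it assumes the inequality in (d) reads $\le 0$ and uses only that $w$ is real, $w > 0$, and $C(\Lambda, \Theta) w = 0$.

```latex
% Sketch of (c) => (d): w > 0 real with C(\Lambda,\Theta) w = 0, u \in \mathbb{C}^m arbitrary.
\begin{align*}
0 \;=\; \operatorname{Re}\bigl(u^{T} C(\Lambda,\Theta)\, w\bigr)
  \;=\; \sum_{i=1}^{n} w_i \,
        \operatorname{Re}\Bigl(\sum_{j=1}^{m} \frac{u_j}{\lambda_i - \theta_j}\Bigr).
\end{align*}
% All w_i >= 0 and at least one w_i > 0. If every real part on the right were
% strictly positive, the sum would be strictly positive, a contradiction.
% Hence some index i satisfies Re( sum_j u_j / (\lambda_i - \theta_j) ) <= 0.
```

The reverse direction is where the Farkas-type alternative (ii) enters, applied to the real and imaginary parts of $C(\Lambda, \Theta)$.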
Geometry of the Ritz values (figures)
Corollary: Interlacing for Hermitian matrices
Corollary (compare: Parlett, The Symmetric Eigenvalue Problem). Suppose $A$ is Hermitian. Let $\theta_1 > \ldots > \theta_m$ denote the Ritz values from an $m$-dimensional Krylov subspace; $\theta_j \ne \lambda_i$. Then for all $j = 1, \ldots, m-1$ there exists an eigenvalue $\lambda$ such that $\theta_{j+1} < \lambda < \theta_j$.
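Corollary: Ritz values = convex combination of eigenvalues
Using $P(\Lambda, \Theta)\hat v = 0$ we get:
$$\theta_j = \sum_{i=1}^{n} \lambda_i \, \frac{P_{i,j}}{P_{1,j} + P_{2,j} + \ldots + P_{n,j}}, \qquad P_{i,j} = \prod_{\substack{q=1 \\ q \ne j}}^{m} |\lambda_i - \theta_q|^2 \, |\hat v_i|^2.$$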
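The step from the linear system to the convex combination is a one-line rearrangement. The sketch below spells it out, treating the weights $P_{i,j}$ simply as nonnegative numbers that are not all zero for a given $j$.

```latex
% Row j of the homogeneous system, written with the nonnegative weights P_{i,j}:
\begin{align*}
\sum_{i=1}^{n} (\lambda_i - \theta_j)\, P_{i,j} = 0
\quad\Longrightarrow\quad
\theta_j \sum_{i=1}^{n} P_{i,j} = \sum_{i=1}^{n} \lambda_i\, P_{i,j}
\quad\Longrightarrow\quad
\theta_j = \sum_{i=1}^{n} \lambda_i \,\frac{P_{i,j}}{\sum_{k=1}^{n} P_{k,j}} .
\end{align*}
% The coefficients P_{i,j} / \sum_k P_{k,j} are nonnegative and sum to 1,
% so \theta_j lies in the convex hull of the eigenvalues.
```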
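Restarting Arnoldi: an example
An unwanted Ritz value cannot equal the extreme $\lambda$. Can it equal the second largest eigenvalue?
$$A = \begin{bmatrix} 4 & & & & \\ & 3 & 2 & & \\ & -2 & 3 & & \\ & & & 3.9 & \\ & & & & 8 \end{bmatrix}, \qquad \lambda_1 = 8, \quad \lambda_2 = 4.$$
Is there an initial vector $v$ such that for $K_3(v; A)$ it holds $\theta_3 \approx \lambda_2$?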
Restarting Arnoldi: an example
$\theta = \lambda_2 + 10^{-6} = 4 + 10^{-6}$.
Restarting Arnoldi: an example
Optimization ($\theta_1, \theta_2 \to \max$, $\theta_3 \to \lambda_2$):
$$v = (0.775693250142234,\ 0.028238213050217,\ 0.028273977339263,\ 0.629795237727870,\ 0.007818295736434)^T.$$
The Ritz values from $K_3(v; A)$ are:
$\theta_1 = 4.183227620474041 + 0.692098306609705i$,
$\theta_2 = 4.183227620474041 - 0.692098306609705i$,
$\theta_3 = 4.000000000000762$.
Restarting to dimension 2 erases the component in direction $u_2$.
Bibliography
- D. C. Sorensen, Implicit Application of Polynomial Filters in a k-step Arnoldi Method, SIAM J. Matrix Anal. Appl., vol. 13, 1992.
- G. W. Stewart, A Krylov-Schur Algorithm for Large Eigenproblems, SIAM J. Matrix Anal. Appl., vol. 23, 2001.
- M. Embree, The Arnoldi Eigenvalue Iteration with Exact Shifts Can Fail, SIAM J. Matrix Anal. Appl., vol. 31, 2009.
- S. M. Malamud, Inverse spectral problem for normal matrices and the Gauss-Lucas theorem, Transactions of the American Mathematical Society, vol. 357, 2005.
- J. F. Queiró, A. L. Duarte, Imbedding conditions for normal matrices, Linear Algebra and its Applications, vol. 430, 2009.
- Z. Bujanović, Krylov Type Methods for Large Scale Eigenvalue Computations, PhD Thesis, Univ. of Zagreb, 2011.