Vandermonde matrices with Chebyshev nodes
Linear Algebra and its Applications 428 (2008)

Vandermonde matrices with Chebyshev nodes

Ren-Cang Li

Department of Mathematics, University of Texas at Arlington, P.O. Box 9408, Arlington, TX, United States

Received 3 February 2005; accepted 5 October 2007; available online 4 December 2007. Submitted by V. Mehrmann.

Abstract

For an N × N Vandermonde matrix V_N = (α_j^{i−1})_{1≤i,j≤N} with translated Chebyshev zero nodes, it is discovered that V_N^T admits an explicit QR decomposition with the R-factor consisting of the coefficients of the translated Chebyshev polynomials. This decomposition then leads to an exact expression for the Frobenius condition number of its submatrix V_{k,N} = (α_j^{i−1})_{1≤i≤k, 1≤j≤N} (the so-called rectangular Vandermonde matrix), bounds on individual singular values, and more. It is explained how these results can be used to establish asymptotically optimal lower bounds on condition numbers of real rectangular Vandermonde matrices, and nearly optimally conditioned real rectangular Vandermonde matrices, on a given interval. Extensions are also made for V_N with nodes being zeros of any translated orthogonal polynomials other than Chebyshev ones. Similar results hold for V_N with translated Chebyshev extreme nodes, too, owing to the fact that V_N^T admits an explicit QR-like decomposition. Closed formulas for, or tight bounds on, the residuals are also presented for the conjugate gradient method, the minimal residual method, and the generalized minimal residual method on certain linear systems Ax = b with A having eigenvalues the same as the nodes mentioned above. As a by-product, they yield positive definite linear systems for which the residuals of the conjugate gradient method are always comparable to the existing error bounds at all iteration steps. © 2007 Elsevier Inc. All rights reserved.
AMS classification: 65F0; 65F35; 5A06

Keywords: Vandermonde matrix; Chebyshev polynomial; Condition number; Conjugate gradient method; MINRES; GMRES; Rate of convergence

Supported in part by NSF CAREER award Grant No. CCR-98750, and by NSF Grant Nos. DMS and DMS. E-mail address: rcli@uta.edu
1. Introduction

Given N numbers α_1, α_2, ..., α_N, called nodes, the associated Vandermonde matrix is defined as

    V_N  def=  ( 1          1          ...  1
                 α_1        α_2        ...  α_N
                 ...        ...             ...
                 α_1^{N−1}  α_2^{N−1}  ...  α_N^{N−1} ).                        (1.1)

V_N appears naturally in polynomial interpolation [6]. For 1 ≤ k ≤ N, let V_{k,N} = (α_j^{i−1})_{1≤i≤k, 1≤j≤N}, the first k rows of V_N, the so-called rectangular Vandermonde matrix. V_{k,N} appears naturally in best polynomial fitting in the least squares sense and, as we shall see later, plays an important role in analyzing the convergence of certain Krylov subspace methods, such as the Conjugate Gradient method (CG) for positive definite linear systems, the Minimal Residual method (MINRES) for indefinite Hermitian linear systems, and the Generalized Minimal Residual method (GMRES) for normal linear systems. For polynomial fitting applications, the condition number of V_{k,N} may account for the accuracy of the computed best fitting polynomials in the worst case scenario, while for analyzing the convergence of Krylov subspace methods, certain minimizations involving V_{k,N} need to be answered.

In [], various asymptotically optimal lower bounds on condition numbers of real V_N were established. The key idea was to use the coefficients of Chebyshev polynomials of the first kind to arrive at lower bounds on the norms of V_N^{−1}, and to explicitly compute the l_1-operator norm of V_N^{−1} with translated Chebyshev zero nodes with the help of Gautschi's formula []. Two similar bounds were also obtained by Beckermann [3].

CG is widely used to solve a positive definite linear system Ax = b of order N. In [], we revisited an example of Meinardus [5], which was used to show that the following well-known and frequently referenced error bound (see, e.g., [8,4,7,30])

    ‖r_k‖_{A^{−1}} / ‖r_0‖_{A^{−1}}  =  ‖A^{−1}b − x_k‖_A / ‖A^{−1}b − x_0‖_A  ≤  2 [Δ_κ^k + Δ_κ^{−k}]^{−1}   (1.2)
was sharp, where r_k = b − Ax_k is the kth CG residual for the kth CG approximation x_k, the M-vector norm is ‖z‖_M def= √(z*Mz), κ ≡ κ(A) = ‖A‖ ‖A^{−1}‖ is the spectral condition number, the generic notation ‖·‖ stands for either the spectral norm (the largest singular value) of a matrix or the Euclidean length of a vector, and

    Δ_t  def=  (√t + 1)/(√t − 1)  for 1 ≠ t > 0,                                (1.3)

which will be used frequently later for different t. But Meinardus [5] did it only for k = N − 1, the next-to-last CG iteration, for which (1.2) is an equality. In [], we obtained a closed expression for the ratio ‖r_k‖_{A^{−1}}/‖r_0‖_{A^{−1}} for all k ≤ N − 1, and in particular we showed that this ratio is always within a constant factor of the upper bound in (1.2). The earlier statement about conditioning and the accuracy of computed fitting polynomials is only partially true, however, as special algorithms that take advantage of the Vandermonde structure often yield more accurately computed results than the condition number would otherwise suggest [5].
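The quantities in (1.2) and (1.3) are easy to experiment with numerically. The following sketch (Python with NumPy; the function names are ours, not from the paper) evaluates Δ_t and compares the Chebyshev-based bound 2[Δ_κ^k + Δ_κ^{−k}]^{−1} against the more commonly quoted weaker form 2Δ_κ^{−k}:

```python
import numpy as np

def Delta(t):
    """Delta_t = (sqrt(t) + 1)/(sqrt(t) - 1), for t > 0, t != 1, as in (1.3)."""
    s = np.sqrt(t)
    return (s + 1.0) / (s - 1.0)

def cg_bound(kappa, k):
    """Chebyshev-based CG bound 2/(Delta^k + Delta^{-k}) from (1.2)."""
    d = Delta(kappa)
    return 2.0 / (d**k + d**(-k))

kappa = 100.0
for k in (1, 5, 10):
    # the tighter form never exceeds the commonly quoted 2 * Delta^{-k}
    assert cg_bound(kappa, k) <= 2.0 * Delta(kappa)**(-k)
```

At k = 0 the bound equals 1, consistent with the ratio in (1.2) starting at 1.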
In this paper, we shall exploit the technique, namely QR or QR-like decompositions of V_{k,N}^T, that was successfully used in [], to accomplish three tasks. First, we shall investigate the conditioning of V_{k,N} with the translated Chebyshev nodes defined in Section 2, and nearly optimally conditioned V_{k,N} whose nodes are restricted to an interval [α, β]. In particular, we will show that with translated Chebyshev nodes on [α, β],

    c_1 N^{−d_1}  ≤  κ_F(V_{k,N})/ρ^k  ≤  c_2 N^{d_2},                          (1.4)

where c_1, c_2, d_1, d_2 are constants, the condition number κ_F(V_{k,N}) is defined later in (1.7), and

    ρ = (1 + √(1+β²)) max{1, β^{−1}}    for α = −β,
    ρ = (1 + √(1+β))² max{1, β^{−1}}    for 0 = α < β.                          (1.5)

Our success in [] relied on solving exactly a minimization problem involving V_{k,N} with translated Chebyshev extreme nodes. As our second task in this paper, we shall solve a few similar minimization problems, each of which relates to the residuals of CG, MINRES, and GMRES on certain linear systems and possibly gives extreme examples that are difficult for these methods. The arguments that lead to the solutions of these minimization problems are along the lines of [], which dealt with only one particular case. This idea of creating extreme examples, especially for CG, works in principle for the symmetric Lanczos algorithm for eigenvalue problems, too, and the interested reader is referred to [0] for how this can be done.

Our third task is to establish various inequalities involving the singular values of V_{k,N}. By default, we denote and order the singular values of an n-by-m matrix X as

    σ_1(X) ≥ σ_2(X) ≥ ... ≥ σ_{min{m,n}}(X).                                    (1.6)

Matrix condition numbers are usually defined for square matrices, but they can be extended without difficulty to non-square matrices. We define X's Frobenius condition number by

    κ_F(X)  def=  [ Σ_{j=1}^{min{m,n}} σ_j(X)² ]^{1/2} [ Σ_{j=1}^{min{m,n}} σ_j(X)^{−2} ]^{1/2}.   (1.7)

The first factor on the right-hand side is just X's Frobenius norm ‖X‖_F. If the rank of X is less than min{m,n}, then σ_{min{m,n}}(X) = 0 and thus κ_F(X) = ∞.
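To make the definitions concrete, here is a small Python/NumPy sketch of V_{k,N} and of κ_F as defined in (1.7). The helper names are ours; the rank test is exact, so in floating point a numerically rank-deficient X produces a huge finite value rather than ∞:

```python
import numpy as np

def vandermonde(nodes, k=None):
    """V_{k,N}: the first k rows of the N x N Vandermonde matrix (alpha_j^{i-1})."""
    nodes = np.asarray(nodes, dtype=float)
    if k is None:
        k = nodes.size
    # row i holds the i-th powers (i = 0..k-1) of every node
    return np.vander(nodes, k, increasing=True).T

def frobenius_cond(X):
    """kappa_F(X) = sqrt(sum sigma_j^2) * sqrt(sum sigma_j^{-2}) as in (1.7)."""
    s = np.linalg.svd(X, compute_uv=False)
    if s[-1] == 0.0:          # exact rank deficiency
        return np.inf
    return np.sqrt(np.sum(s**2)) * np.sqrt(np.sum(s**-2.0))

nodes = np.array([0.1, 0.4, 0.7, 1.0])
V24 = vandermonde(nodes, k=2)      # a rectangular V_{2,4}
kF = frobenius_cond(V24)
```

The first factor computed by `frobenius_cond` is exactly ‖X‖_F, as noted after (1.7).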
Later, in Section 3, the l_p-condition number κ_p(X) will be defined, too, for p ≥ 1. The rest of this paper is organized as follows. Section 2 briefly presents relevant preliminary material on Chebyshev polynomials of the first kind. In Section 3, a lower bound on the l_p-condition number of V_{k,N} with all α_j ∈ [α, β] (a given interval) is established. This bound, combined with later results on V_{k,N} with translated Chebyshev nodes, leads to the asymptotically optimal lower bounds in (1.4) and (1.5) for the case α = −β and the case αβ ≥ 0. Section 4 is devoted to V_{k,N} with translated Chebyshev zero nodes, while Section 5 treats V_{k,N} with translated Chebyshev extreme nodes. Possible extensions to V_{k,N} whose nodes are translated zeros of an orthogonal polynomial are outlined in Section 6. Finally, we give some conclusions in Section 7. Some of the proofs for the theorems in Sections 4 and 5 are quite long and tedious and thus are postponed to Appendices A and B.

Notation. Throughout this paper, C^{n×m} is the set of all n × m complex matrices, C^n = C^{n×1}, and C = C^1. R^{n×m}, R^n, and R are defined similarly, with the word complex replaced by real. I_n (or simply I if its dimension is clear from the context) is the n × n identity matrix, and e_j is its
jth column. X ⪯ Y for two Hermitian matrices means that Y − X is positive semidefinite. The superscript * takes conjugate transpose, while T takes transpose only. We shall also adopt MATLAB-like conventions to access the entries of vectors and matrices: i : j is the set of integers from i to j inclusive; for a vector u and a matrix X, u_(j) is u's jth entry, X_(i,j) is X's (i,j)th entry, and diag(u) is the diagonal matrix with (diag(u))_(j,j) = u_(j); X's submatrices X_(k:l,i:j), X_(k:l,:), and X_(:,i:j) consist of the intersections of rows k to l with columns i to j, of rows k to l, and of columns i to j, respectively. ⌊ξ⌋ is the largest integer that is no bigger than ξ, while ⌈ξ⌉ is the smallest integer that is no less than ξ. Σ′_j means that the first term of the sum is halved, while for Σ″_j both the first and the last terms are halved.

Some of the estimates for condition numbers in this paper are not intended to be best possible, but rather to correctly show their asymptotic speeds as k and N go to ∞. For this purpose, we shall use a_{k,N} ≍_N b_{k,N} to mean that there are constants c_1, c_2, d_1, and d_2 such that c_1 N^{−d_1} ≤ a_{k,N}/b_{k,N} ≤ c_2 N^{d_2}, and a_{k,N} ∼ b_{k,N} to mean a_{k,N}/b_{k,N} → 1 as k, N → ∞.

2. Chebyshev polynomials

The mth Chebyshev polynomial of the first kind is

    T_m(t) = cos(m arccos t)                              for |t| ≤ 1,          (2.1)
           = ½[(t + √(t²−1))^m + (t + √(t²−1))^{−m}]      for |t| ≥ 1.          (2.2)

It frequently shows up in numerical analysis and computations because of its numerous nice properties, for example |T_m(t)| ≤ 1 for |t| ≤ 1, while |T_m(t)| grows extremely fast for |t| > 1. It can be verified (see, e.g., []) that

    T_m((t+1)/(t−1)) = ½[Δ_t^m + Δ_t^{−m}]  for 1 ≠ t > 0.                      (2.3)

Given two (real or complex) numbers ω ≠ 0 and τ, the mth translated Chebyshev polynomial in z of degree m is defined by

    T_m(z; ω, τ)  def=  T_m(z/ω + τ)                                            (2.4)
                    =  a_{mm} z^m + a_{m−1,m} z^{m−1} + ... + a_{1m} z + a_{0m},   (2.5)

where a_{jm} ≡ a_{jm}(ω, τ) are functions of ω and τ. Their explicit dependence on ω and τ is often suppressed for convenience.
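Both branches (2.1)-(2.2) and the identity (2.3) can be checked numerically; a minimal Python sketch (the name cheb_T is ours):

```python
import numpy as np

def cheb_T(m, t):
    """T_m via the trig form (2.1) on [-1, 1] and the explicit form (2.2) for |t| >= 1."""
    t = np.asarray(t, dtype=float)
    inside = np.abs(t) <= 1
    out = np.empty_like(t)
    out[inside] = np.cos(m * np.arccos(t[inside]))
    u = t[~inside]
    w = u + np.sign(u) * np.sqrt(u * u - 1.0)   # t + sqrt(t^2 - 1), sign-safe
    out[~inside] = 0.5 * (w**m + w**(-m))
    return out

# identity (2.3): T_m((t+1)/(t-1)) = (Delta_t^m + Delta_t^{-m})/2
t = 4.0
d = (np.sqrt(t) + 1.0) / (np.sqrt(t) - 1.0)     # Delta_4 = 3
for m in range(5):
    lhs = cheb_T(m, np.array([(t + 1.0) / (t - 1.0)]))[0]
    assert np.isclose(lhs, 0.5 * (d**m + d**(-m)))
```

The sign-safe factor in the |t| ≥ 1 branch also covers t ≤ −1, where T_m(t) = (−1)^m T_m(−t).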
Define

    Chebyshev zero nodes:     t_{jm} = cos θ_{jm},  θ_{jm} = (2j−1)π/(2m),  1 ≤ j ≤ m,   (2.6)
    Chebyshev extreme nodes:  τ_{jm} = cos ϑ_{jm},  ϑ_{jm} = jπ/m,          0 ≤ j ≤ m.   (2.7)

It can be seen that the t_{jm} (1 ≤ j ≤ m) are the zeros of T_m(t), and the τ_{jm} (0 ≤ j ≤ m) are the extreme points of T_m(t) in [−1, 1]. Define accordingly
    Translated Chebyshev zero nodes:     t^tr_{jm} = ω(t_{jm} − τ),  1 ≤ j ≤ m,   (2.8)
    Translated Chebyshev extreme nodes:  τ^tr_{jm} = ω(τ_{jm} − τ),  0 ≤ j ≤ m.   (2.9)

Let the upper triangular R_m ∈ C^{m×m}, a matrix-valued function of ω and τ, be

    R_m ≡ R_m(ω, τ)  def=  ( a_{00}  a_{01}  a_{02}  ...  a_{0,m−1}
                                     a_{11}  a_{12}  ...  a_{1,m−1}
                                             a_{22}  ...  a_{2,m−1}
                                                     ⋱    ⋮
                                                          a_{m−1,m−1} ),        (2.10)

i.e., the jth column consists of the coefficients of T_{j−1}(z; ω, τ). In [], we defined

    S_{m,p}(ω, τ) = ( Σ_{j=0}^m |a_{jm}|^p )^{1/p}  for p ≥ 1.

Explicit formulas were found for p = 1 and τ = 0:

    S_{m,1}(ω, 0) = |T_m(ι/ω)| ∼ ½ (1/|ω| + √(1 + 1/ω²))^m,                     (2.11)

where ι = √−1, and for all real τ with |τ| ≥ 1:

    S_{m,1}(ω, τ) = T_m(|τ| + 1/|ω|).                                           (2.12)

In particular, for τ = ±1,

    S_{m,1}(ω, ±1) = T_m(1 + 1/|ω|) ∼ ½ [√(1/(2|ω|)) + √(1 + 1/(2|ω|))]^{2m}.   (2.13)

No explicit formulas or tight bounds are known for other τ, however. For p ≠ 1, S_{m,p}(ω, τ) relates to S_{m,1}(ω, τ) by the inequalities

    (m+1)^{−1/p′} S_{m,1}(ω, τ) ≤ S_{m,p}(ω, τ) ≤ S_{m,1}(ω, τ),                (2.14)
    ⌈(m+1)/2⌉^{−1/p′} S_{m,1}(ω, 0) ≤ S_{m,p}(ω, 0) ≤ S_{m,1}(ω, 0),            (2.15)

where 1/p + 1/p′ = 1, which will be used to define p′ throughout the later sections. In the rest of this paper, by default, ω ≠ 0 and τ are two prescribed (real or complex) numbers, but when there is an interval [α, β] in the context, they are given by

    ω = (β − α)/2 > 0,   τ = −(α + β)/(β − α).                                  (2.16)

Although in [9] these formulas were established with an interval [α, β] in the context (see (2.16)), the proofs there can easily be modified for ω and τ that have nothing to do with any interval [α, β].
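The nodes (2.6)-(2.7) and the coefficient matrix R_m of (2.10) can be generated from the three-term recurrence T_j = 2(z/ω + τ)T_{j−1} − T_{j−2}; a Python sketch (function names ours), which also gives S_{m,p}:

```python
import numpy as np

def cheb_zero_nodes(m):
    """t_{jm} = cos((2j-1)pi/(2m)), j = 1..m: the zeros of T_m, as in (2.6)."""
    j = np.arange(1, m + 1)
    return np.cos((2 * j - 1) * np.pi / (2 * m))

def cheb_extreme_nodes(m):
    """tau_{jm} = cos(j pi/m), j = 0..m: extreme points of T_m on [-1,1], as in (2.7)."""
    return np.cos(np.arange(m + 1) * np.pi / m)

def R_matrix(m, omega, tau):
    """Upper-triangular R_m of (2.10): column j holds the coefficients of
    T_{j-1}(z; omega, tau), built via T_j = 2(z/omega + tau)T_{j-1} - T_{j-2}."""
    R = np.zeros((m, m))
    R[0, 0] = 1.0                              # T_0 = 1
    if m > 1:
        R[0, 1], R[1, 1] = tau, 1.0 / omega    # T_1 = z/omega + tau
    for j in range(2, m):
        up = np.zeros(m)
        up[1:j + 1] = R[:j, j - 1]             # coefficients of z * T_{j-1}
        R[:, j] = 2.0 * (up / omega + tau * R[:, j - 1]) - R[:, j - 2]
    return R

def S_mp(m, p, omega, tau):
    """S_{m,p}(omega, tau) = (sum_{j=0}^m |a_{jm}|^p)^(1/p)."""
    a = R_matrix(m + 1, omega, tau)[:, m]
    return np.sum(np.abs(a) ** p) ** (1.0 / p)
```

With ω = 1 and τ = 0 this reproduces the classical coefficients, e.g. T_3 = 4z³ − 3z.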
6 808 R.-C. Li / Linear Algebra and its Applications 48 (008) The linear transformation t(z) = z ω + τ = ( z α + β ) β α maps z [α, β] one-to-one and onto t [, ]. (.7) 3. Lower bounds for p (V k,n ) This section concerns V N with all α j [α, β] but otherwise general. The results will be used in the later sections. We start by defining the l p vector and operator norm. Given p, the l p -norm of n-vector u and the l p -operator norm of matrix X are defined as n u p = u (j) p j= /p, X p = max u/=0 Xu p u p. It can be proved that X p = X T p ; see, e.g., [8]. Define 3 lub p (V k,n ) def V T = u/=0 k,n u p u p, p (V k,n ) def = V k,n p lub p (V k,n ). (3.) Such definition is unlikely new, and is consistent with the case for p = and the square matrix case. In fact lub (V k,n ) = σ (V k,n ), V k,n s smallest singular value, and for k = N, it can be shown that lub p (V N ) = VN p. Theorem 3.. For V k,n with all nodes α j [α, β], we have N /p lub p (V k,n ) S k,p (ω, τ), (3.) S k,p (ω, τ) p (V k,n ) V k,n p, (3.3) N /p where ω and τ are defined as in (.6). Proof. Let v be the vector of the coefficients of T k (z; ω,τ) T k (z/ω + τ)such that v (j+) = a jk for 0 j k. Then Vk,N T v = (T k (α /ω + τ) T k (α /ω + τ) T k (α N /ω + τ)) T, which yields Vk,N T v p N /p because T k (z/ω + τ) for z [α, β]. We therefore have Vk,N T lub p (V k,n ) = u p u/=0 u p V T k,n v p v p This gives (3.). Eq. (3.3) is the consequence of (3.) and (3.). N /p ( k j=0 a j,k p ) /p. We now specialize Theorem 3. to the case α = β or αβ 0. 3 For matrix X C N k and k<n, it should be defined as lub p (X) = u/=0 Xu p u p.
Theorem 3.2. Let V_{k,N} have all nodes α_j ∈ [α, β], and suppose max_j |α_j| ≥ η max{|α|, |β|} for some η > 0.

(1) If α = −β, then

    κ_p(V_{k,N}) ≥ max{1, η^{k−1}β^{k−1}} S_{k−1,1}(β, 0) ⌈k/2⌉^{−1/p′} N^{−1/p}   (3.4)
                ≍ ⌈k/2⌉^{−1/p′} N^{−1/p} max{ (β^{−1} + √(1+β^{−2}))^{k−1}, η^{k−1}(1 + √(1+β²))^{k−1} }.   (3.5)

(2) If⁴ 0 ≤ α < β, then

    κ_p(V_{k,N}) ≥ max{1, η^{k−1}β^{k−1}} S_{k−1,1}(ω, τ) k^{−1/p′} N^{−1/p}        (3.6)
                ≥ max{1, η^{k−1}β^{k−1}} S_{k−1,1}(β/2, 1) k^{−1/p′} N^{−1/p}       (3.7)
                ≍ k^{−1/p′} N^{−1/p} max{ (β^{−1/2} + √(1+β^{−1}))^{2(k−1)}, η^{k−1}(1 + √(1+β))^{2(k−1)} }.   (3.8)

Proof. Since max_j |α_j| ≥ η max{|α|, |β|}, we have

    ‖V_{k,N}‖_p ≥ max_j ‖V_{k,N} e_j‖_p = max_j ( Σ_{i=0}^{k−1} |α_j|^{ip} )^{1/p}
                ≥ max{1, max_j |α_j|^{k−1}} ≥ max{1, η^{k−1}|α|^{k−1}, η^{k−1}β^{k−1}}.   (3.9)

Now if α = −β, then ω = β and τ = 0, and by (2.11) and (2.15),

    S_{k−1,p}(β, 0) ≥ ⌈k/2⌉^{−1/p′} S_{k−1,1}(β, 0) ∼ ⌈k/2⌉^{−1/p′} · ½ (β^{−1} + √(1+β^{−2}))^{k−1}.

This, together with Theorem 3.1 and (3.9), leads to (3.4) and (3.5). If 0 ≤ α < β, then ω = (β−α)/2 ≤ β/2 and |τ| ≥ 1. Since S_{k−1,p}(ω, τ) = S_{k−1,p}(|ω|, |τ|) and it is increasing in |τ| and decreasing in |ω| [], we have by (2.12) and (2.14)

⁴ Any result for this case in the rest of this paper holds for the case α < β ≤ 0 as well.
    S_{k−1,p}(ω, τ) ≥ k^{−1/p′} S_{k−1,1}(ω, τ) ≥ k^{−1/p′} S_{k−1,1}(β/2, 1) ∼ k^{−1/p′} · ½ (β^{−1/2} + √(1+β^{−1}))^{2(k−1)}.

This, together with Theorem 3.1 and (3.9), leads to (3.6), (3.7), and (3.8). □

Taking η = 1 in Theorem 3.2 gives better lower bounds, and it is always possible to do so by picking an appropriate (e.g., the smallest) [α, β] that contains all α_j. What makes us keep such a factor η is that it gives us some flexibility in applying the theorem later on; one such occasion is for V_N with translated Chebyshev zero nodes in the next section. While (3.6) is for all α_j ∈ [α, β] with 0 ≤ α and possibly α ≠ 0, (3.7) is actually for all α_j ∈ [0, β] only, i.e., α = 0.

The asymptotic expansions (3.5) and (3.8) are quite informative: (1) κ_p(V_{k,N}) as a function of k grows at least exponentially; (2) both expressions are smallest for β near 1, which may indicate that the best, or equivalently the smallest, κ_p(V_{k,N}) may occur for some V_{k,N} with max_j |α_j| near 1. This is indeed true; see Theorem 4.4 below.

4. V_N with Chebyshev zero nodes

In this section, V_N has the translated Chebyshev zero nodes α_j = t^tr_{jN} (1 ≤ j ≤ N), except possibly in Theorem 4.4. Set

    T_N  def=  ( T_0(t_{1N})      T_0(t_{2N})      ...  T_0(t_{NN})
                 T_1(t_{1N})      T_1(t_{2N})      ...  T_1(t_{NN})
                 ...              ...                   ...
                 T_{N−1}(t_{1N})  T_{N−1}(t_{2N})  ...  T_{N−1}(t_{NN}) ).      (4.1)

Then

    V_N^T R_N = T_N^T,                                                          (4.2)

and it can be verified that [0]

    T_N T_N^T = (N/2) Π,   Π = diag(2, 1, 1, ..., 1).                           (4.3)

Here Π ∈ R^{N×N}; in what follows, we shall let Π have a generic dimension determined by the context. Notice that T_N is real, while V_N and R_N may be complex if ω or τ is. Eqs. (4.2) and (4.3) essentially give a QR decomposition for V_N^T after normalizing the columns of T_N^T to have unit norm. Extracting the first k columns from both sides of V_N^T = T_N^T R_N^{−1} yields the following theorem.

Theorem 4.1 [0]. Let V_N have the translated Chebyshev zero nodes α_j = t^tr_{jN} (1 ≤ j ≤ N) defined by (2.6) and (2.8) with arbitrary ω ≠ 0 and τ, and let the upper triangular R_k be defined as in (2.10) and T_N as in (4.1).
Then V_{k,N}^T = T_{k,N}^T R_k^{−1}, where T_{k,N} def= (T_N)_(1:k,:) is T_N's first k rows.
4.1. Condition number κ_F(V_{k,N})

Let V̄_{k,N} be the complex conjugate of V_{k,N}. By Theorem 4.1, we have⁵

    V̄_{k,N} V_{k,N}^T = R_k^{−*} T_{k,N} T_{k,N}^T R_k^{−1} = R_k^{−*} (T_N T_N^T)_(1:k,1:k) R_k^{−1} = (N/2) R_k^{−*} Π R_k^{−1},   (4.4)
    (V̄_{k,N} V_{k,N}^T)^{−1} = (2/N) R_k Π^{−1} R_k^*.                         (4.5)

Consequently,

    Σ_j [σ_j(V_{k,N})]²     = trace( (N/2) R_k^{−*} Π R_k^{−1} ) = (N/2) ‖Π^{1/2} R_k^{−1}‖_F²,   (4.6)
    Σ_j [σ_j(V_{k,N})]^{−2} = (2/N) Σ′_{j=0}^{k−1} [S_{j,2}(ω, τ)]².            (4.7)

Eq. (4.6) involves R_k^{−1}, making it a little hard to use without inverting R_k first; we might be better off using Σ_j [σ_j(V_{k,N})]² = ‖V_{k,N}‖_F² when it comes to estimating ‖V_{k,N}‖_F. Nevertheless, it relates the singular values to the coefficients of the T_j(z; ω, τ) in a nontrivial way.

Theorem 4.2. Let V_{k,N} have the translated Chebyshev zero nodes defined by (2.6) and (2.8) with arbitrary ω ≠ 0 and τ. Then

    κ_F(V_{k,N}) = ‖V_{k,N}‖_F [ (2/N) Σ′_{j=0}^{k−1} [S_{j,2}(ω, τ)]² ]^{1/2}.

While this theorem does not require any interval in the context, previous similar estimates (bounds) for the condition numbers of V_{k,N} were done for α_j = t^tr_{jN} on [α, β] only, with either α = −β or αβ ≥ 0, k = N, and κ_p(V_N) for p = 1 in Gautschi [] and for all p in Li []. We shall now estimate κ_F(V_{k,N}) for all k ≤ N.

⁵ One can also have V_{k,N} V_{k,N}^T = (N/2) R_k^{−T} Π R_k^{−1} and (V_{k,N} V_{k,N}^T)^{−1} = (2/N) R_k Π^{−1} R_k^T. Although these are very similar to (4.4) and (4.5), they can be quite different because ω and/or τ, and thus V_{k,N}, may be complex. Whether these two identities have any use remains to be seen.
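Theorem 4.1 and the identities (4.4)-(4.7) behind Theorem 4.2 can be confirmed numerically for real ω and τ; a self-contained Python sketch (helper names and sizes are ours):

```python
import numpy as np

def R_matrix(k, omega, tau):
    """Column j holds the coefficients of T_{j-1}(z; omega, tau), as in (2.10)."""
    R = np.zeros((k, k))
    R[0, 0] = 1.0
    if k > 1:
        R[0, 1], R[1, 1] = tau, 1.0 / omega
    for j in range(2, k):
        up = np.zeros(k)
        up[1:j + 1] = R[:j, j - 1]
        R[:, j] = 2.0 * (up / omega + tau * R[:, j - 1]) - R[:, j - 2]
    return R

alpha, beta = 0.5, 2.0
omega, tau = (beta - alpha) / 2.0, -(alpha + beta) / (beta - alpha)
N, k = 10, 4
j = np.arange(1, N + 1)
nodes = omega * (np.cos((2 * j - 1) * np.pi / (2 * N)) - tau)   # t^tr_{jN} of (2.8)
V = np.vander(nodes, k, increasing=True).T                      # V_{k,N}

R = R_matrix(k, omega, tau)
Pi = np.diag([2.0] + [1.0] * (k - 1))
Rinv = np.linalg.inv(R)
lhs = V @ V.T
rhs = (N / 2.0) * Rinv.T @ Pi @ Rinv                            # identity (4.4), real case
assert np.allclose(lhs, rhs)

# Theorem 4.2's closed form (the primed sum halves the j = 0 term)
S2 = np.array([np.linalg.norm(R[:, j]) for j in range(k)])      # S_{j,2}(omega, tau)
halved = S2**2
halved[0] *= 0.5
kappaF = np.linalg.norm(V, 'fro') * np.sqrt((2.0 / N) * halved.sum())
s = np.linalg.svd(V, compute_uv=False)
assert np.isclose(kappaF, np.sqrt((s**2).sum()) * np.sqrt((s**-2.0).sum()))
```

Agreement of `kappaF` with the SVD-based value illustrates that (4.7) encodes Σ_j σ_j^{−2} exactly, not merely a bound.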
Lemma 4.1. Let α_j = t^tr_{jN} (1 ≤ j ≤ N) on [α, β]. Then max_j |α_j| = η max{|α|, |β|}, where

    η = (1+δ)/2 + ((1−δ)/2) cos(π/(2N)) = 1 − (1−δ)π²/(16N²) + O(N^{−4})   if αβ ≥ 0,
    η = (1−δ)/2 + ((1+δ)/2) cos(π/(2N)) = 1 − (1+δ)π²/(16N²) + O(N^{−4})   if αβ < 0,

and δ = min{|α|, |β|}/max{|α|, |β|}. Consequently, for k ≤ N,

    η^{k−1} = 1 − (k−1)(1−δ)π²/(16N²) + O((k−1)²N^{−4})   if αβ ≥ 0,
    η^{k−1} = 1 − (k−1)(1+δ)π²/(16N²) + O((k−1)²N^{−4})   if αβ < 0.

Proof. The expression for η is a consequence of t^tr_{jN} = ω(t_{jN} − τ) = ((β−α)/2) cos((2j−1)π/(2N)) + (α+β)/2. The asymptotic expansion for η^{k−1} can be obtained by expanding exp((k−1) ln η). □

With this lemma, we have

    ‖V_{k,N}‖_F ≤ √(kN) max{1, max_j |t^tr_{jN}|^{k−1}} ≤ √(kN) [max{1, |α|, |β|}]^{k−1},   (4.8)
    ‖V_{k,N}‖_F ≥ max{1, max_j |t^tr_{jN}|^{k−1}} ≍_N [max{1, |α|, |β|}]^{k−1}.             (4.9)

Together they imply

    Σ_j [σ_j(V_{k,N})]² = ‖V_{k,N}‖_F² ≍_N [max{1, |α|, |β|}]^{2(k−1)}.                     (4.10)

By (2.14), (4.10), and Theorem 4.2, we have

    κ_F(V_{k,N}) ≍_N [max{1, |α|, |β|}]^{k−1} ( Σ_{j=0}^{k−1} [S_{j,1}(ω, τ)]² )^{1/2}.     (4.11)

Theorem 4.3. Let V_{k,N} have the translated Chebyshev zero nodes on [α, β].

(1) If α = −β, then

    κ_F(V_{k,N}) ≍_N max{1, β^{k−1}} S_{k−1,1}(β, 0)                                        (4.12)
                 ≍_N max{ (β^{−1} + √(1+β^{−2}))^{k−1}, (1 + √(1+β²))^{k−1} }.              (4.13)

Thus those V_{k,N} with β near 1 are nearly optimally conditioned among all V_{k,N} with the translated Chebyshev zero nodes on [−β, β].

(2) If 0 ≤ α < β, then

    κ_F(V_{k,N}) ≍_N max{1, β^{k−1}} S_{k−1,1}(ω, τ)                                        (4.14)
                 ≥ max{1, β^{k−1}} S_{k−1,1}(β/2, 1).
In particular, if 0 = α < β,

    κ_F(V_{k,N}) ≍_N max{1, β^{k−1}} S_{k−1,1}(β/2, 1)
                 ≍_N max{ (β^{−1/2} + √(1+β^{−1}))^{2(k−1)}, (1 + √(1+β))^{2(k−1)} }.       (4.15)

Thus those V_{k,N} with β near 1 are nearly optimally conditioned among all V_{k,N} with the translated Chebyshev zero nodes on [0, β].

Proof. For the case α = −β, we have ω = β and τ = 0, and (4.12) and (4.13) follow from (4.11) and (2.11). For the case αβ ≥ 0, use (4.11) and (2.12) to complete the proof. □

Since κ_F(V_N) ≍_N κ_p(V_N), all of (4.12)-(4.15) for k = N can be deduced from results in []. They, in fact, together with Theorems 3.1 and 3.2 for k = N, were the foundation of []. Now we have similar conclusions for all V_{k,N} with arbitrary nodes in [α, β] and all k. They are summarized in the following theorem, with the help of Theorem 3.2.

Theorem 4.4. If α = −β or αβ ≥ 0, then subject to all α_j ∈ [α, β] and (max_j |α_j|)^N ≍_N [max{|α|, |β|}]^N,

    min κ_F(V_{k,N}) ≍_N [max{1, |α|, |β|}]^{k−1} S_{k−1,1}(ω, τ),

and thus V_{k,N} with α_j = t^tr_{jN} (1 ≤ j ≤ N) in [α, β] is nearly optimally conditioned among all V_{k,N} with nodes in [α, β]. In particular, subject to the same constraints,

    min κ_F(V_{k,N}) ≍_N RHS of (4.13) for α = −β,   min κ_F(V_{k,N}) ≍_N RHS of (4.15) for 0 = α < β.

Furthermore,

    min_{α_j ∈ R} κ_F(V_{k,N}) ≍_N (1 + √2)^{k−1},   min_{α_j ≥ 0} κ_F(V_{k,N}) ≍_N (1 + √2)^{2(k−1)}.

But questions such as what the asymptotically optimal lower bounds and/or the nearly optimally conditioned Vandermonde matrices are for an interval with α ≠ −β, α < 0, and β > 0 were not answered in [9,]. With Theorem 4.2 here, we are one step closer, as we shall explain. Answers to both questions would be firm if we could show that the right-hand sides of (3.3) and (4.11) were equivalent in the sense of ≍_N. We suspect this to be very much true, because it would be reasonable to expect S_{j,1}(ω, τ) to be (almost) nondecreasing as j increases, but we have no proof of it for now. So we formulate a conjecture as follows, which has been known to be true for α = −β or αβ ≥ 0, by examining (2.11) and (2.12).

Conjecture 4.1.
For α < β, and ω and τ defined as in (2.16),

    Σ_{j=0}^{k−1} [S_{j,1}(ω, τ)]² ≍_N [S_{k−1,1}(ω, τ)]².
4.2. Extreme examples for CG, MINRES, and GMRES

A key component in [0,] for devising examples that achieve the sharpness of the existing error bound (1.2) for CG is the computation of the minimization problem min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ for V_{k,N} with the translated Chebyshev zero or extreme nodes on an interval. The convergence analysis of GMRES for Ax = b with normal A ends up with the same computation, too, except possibly with complex α_j. In this subsection we present closed formulas or tight bounds for similar minimization problems. Extreme examples can then be devised based on them, upon noticing (4.17) and (4.19) below.

Any normal matrix A ∈ C^{N×N} admits the eigen-decomposition

    A = Q Λ Q*,   Q*Q = I_N,   Λ = diag(α_1, α_2, ..., α_N).                    (4.16)

For CG, all α_j > 0, and the kth residual r_k satisfies

    ‖r_k‖_{A^{−1}} = min_{y ∈ K_k} ‖b − Ay‖_{A^{−1}}  [8, p. 306]
                   = min_{u_(1)=1} ‖diag(g)V_{k+1,N}^T u‖,  []                  (4.17)

where K_k ≡ K_k(A, b) is the kth Krylov subspace of A on b defined as

    K_k ≡ K_k(A, b)  def=  span{b, Ab, ..., A^{k−1}b},                          (4.18)

g = Λ^{−1/2} Q* b, and u ∈ C^{k+1}. In general, for GMRES with normal A, including MINRES proposed in [6] for possibly indefinite Hermitian A, the kth residual r_k satisfies

    ‖r_k‖ = min_{y ∈ K_k} ‖b − Ay‖  [6, 8]
          = min_{u_(1)=1} ‖diag(g)V_{k+1,N}^T u‖,  [6, 0, 4]                    (4.19)

where g = Q* b. After substituting k for k + 1, the convergence analysis in both (4.17) and (4.19) rests on computing or estimating

    min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ / ‖g‖ = [ e_1^T ( V̄_{k,N} [diag(|g|)]² V_{k,N}^T )^{−1} e_1 ]^{−1/2} / ‖g‖,   (4.20)

where |g| is the vector obtained by taking entrywise absolute values. In obtaining the equality in (4.20) by Lemma 4.2 below, we have assumed that diag(g)V_{k,N}^T has rank k. In light of the CG connection in (4.17), we shall set

    κ  def=  max_j α_j / min_j α_j = t^tr_{1N}/t^tr_{NN}  <  κ̄  def=  β/α,      (4.21)

whenever all α_j = t^tr_{jN} on [α, β] with 0 ≤ α, for which |T_j(τ)| = ½[Δ_κ̄^j + Δ_κ̄^{−j}] by (2.3). Quantities in the theorems below that are undefined for α = 0 should be interpreted as limits obtained by letting α → 0+; for example, κ̄ = +∞ and Δ_κ̄ = 1 when α = 0.
In what follows, we shall compute the quantity in (4.20) under the following three different situations:
(1) α_j = t^tr_{jN} with ω ≠ 0 and τ arbitrary, and |g_(j)| = 1 (1 ≤ j ≤ N), for which ‖g‖ = √N;

(2) α_j = t^tr_{jN} on [α, β] with 0 ≤ α, and g_(j) = 1/√(t^tr_{jN}) (1 ≤ j ≤ N), for which

    ‖g‖ = [ (N/√(αβ)) (Δ_κ̄^N − Δ_κ̄^{−N})/(Δ_κ̄^N + Δ_κ̄^{−N}) ]^{1/2}   for 0 < α,
    ‖g‖ = N √(2/β)                                                       for 0 = α;   (4.22)

(3) α_j = t^tr_{jN} on [α, β] with 0 ≤ α, and g_(j) = √(t^tr_{jN}) (1 ≤ j ≤ N), for which ‖g‖ = √(N(α+β)/2).

Norm formulas for ‖g‖ in items (2) and (3) above will be implied in the proofs later. Define

    Φ_{τ,k} = Σ′_{j=0}^{k−1} |T_j(τ)|².                                         (4.23)

In its present general form, the next lemma was proved in [0]. It was also implied by the proof of [6, Theorem 2.]; see also [3].

Lemma 4.2. If Z has full column rank, then

    min_{u_(1)=1} ‖Zu‖ = [ e_1^T (Z*Z)^{−1} e_1 ]^{−1/2}.                       (4.24)

Theorem 4.5. Let V_{k,N} have the translated Chebyshev zero nodes defined by (2.6) and (2.8) with arbitrary ω ≠ 0 and τ, all |g_(j)| = 1, and let Φ_{τ,k} be as in (4.23). Then for k ≤ N,

    min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ / ‖g‖ = (2Φ_{τ,k})^{−1/2}.               (4.25)

If α_j = t^tr_{jN} on [α, β] with 0 ≤ α, then for k ≤ N,

    (2k−1)^{−1/2} · 2[Δ_κ̄^{k−1} + Δ_κ̄^{−(k−1)}]^{−1} ≤ (2Φ_{τ,k})^{−1/2} ≤ √2 [Δ_κ̄^{k−1} + Δ_κ̄^{−(k−1)}]^{−1}.   (4.26)

In particular, if α = 0, then (2Φ_{τ,k})^{−1/2} = 1/√(2k−1).

Proof. In (4.20), [diag(|g|)]² = I_N. Eq. (4.5) gives

    e_1^T (V̄_{k,N} V_{k,N}^T)^{−1} e_1 = (2/N) [ ½|a_{00}|² + Σ_{j=1}^{k−1} |a_{0j}|² ],   (4.27)

where a_{0j} ≡ a_{0j}(ω, τ), as in (2.5), is the constant term of T_j(z/ω + τ), and thus a_{0j} = T_j(τ). This gives (4.25). For [α, β] with 0 ≤ α, (2.3) implies the bounds in (4.26). If α = 0, then τ = −1 and thus all |a_{0j}| = 1. □

Let us see how tight the existing error bound 2[Δ_κ^{k−1} + Δ_κ^{−(k−1)}]^{−1} is for the case in (4.26) with 0 < α. Since κ/κ̄ → 1 as N → ∞, we shall instead use the slightly larger bound 2[Δ_κ̄^{k−1} + Δ_κ̄^{−(k−1)}]^{−1} in our comparison. The ratio of this bound over the actual minimum admits an explicit estimate in terms of k and Δ_κ̄ and, for fixed κ̄, converges quickly to a modest constant as k grows.
Fig. 4.1. Ratios of 2[Δ_κ̄^{k−1} + Δ_κ̄^{−(k−1)}]^{−1} over min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖/‖g‖ for V_{k,N} with translated Chebyshev zero nodes on [α, 1] (thus κ̄ = 1/α). Left: all |g_(j)| = 1; Right: random g with ‖g‖ = 1.

At the left of Fig. 4.1, this ratio is plotted for k ≤ 50 with κ̄ = 10² and 10⁴; in both cases it quickly converges to a constant barely above 1. Also plotted, at the right of Fig. 4.1, are the ratios of 2[Δ_κ̄^{k−1} + Δ_κ̄^{−(k−1)}]^{−1} over min_{u_(1)=1} ‖diag(g̃)V_{k,N}^T u‖/‖g̃‖, where g̃ is a random vector. It shows that the existing upper bound is still pretty good for that case, too. This can be partially explained. Let g be as in Theorem 4.5, let g̃ ∈ C^N, and set

    γ_min = √N min_j |g̃_(j)| / ‖g̃‖,   γ_max = √N max_j |g̃_(j)| / ‖g̃‖.

Then

    γ_min · min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖/‖g‖ ≤ min_{u_(1)=1} ‖diag(g̃)V_{k,N}^T u‖/‖g̃‖ ≤ γ_max · min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖/‖g‖.   (4.29)

Since min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖/‖g‖ is comparable to 2[Δ_κ̄^{k−1} + Δ_κ̄^{−(k−1)}]^{−1}, the quantity in the middle of (4.29) should be, too, unless possibly the magnitudes of the g̃_(j) vary too much.
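Lemma 4.2 is also easy to sanity-check: fixing u_(1) = 1 and minimizing over the remaining entries is an ordinary least-squares problem, whose optimal value must match the closed form (4.24). A small Python sketch (the random Z is our choosing):

```python
import numpy as np

rng = np.random.default_rng(1)
Z = rng.standard_normal((12, 5))        # full column rank with overwhelming probability

# brute force: u = (1, w), minimize ||Z[:, 0] + Z[:, 1:] w|| over w
w, *_ = np.linalg.lstsq(Z[:, 1:], -Z[:, 0], rcond=None)
brute = np.linalg.norm(Z[:, 0] + Z[:, 1:] @ w)

# closed form (4.24): [e_1^T (Z^* Z)^{-1} e_1]^{-1/2}
G = Z.T @ Z
closed = np.linalg.inv(G)[0, 0] ** -0.5
assert np.isclose(brute, closed)
```

The same check applies verbatim to Z = diag(g)V_{k,N}^T, which is how (4.20) was obtained.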
Proofs of the next two theorems are rather complicated and technical and thus are postponed to Appendix A.

Theorem 4.6. Let α_j = t^tr_{jN} on [α, β] with 0 ≤ α, and g_(j) = 1/√(t^tr_{jN}) (1 ≤ j ≤ N). If 0 < α, then we have for k ≤ N

    min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ / ‖g‖ = ϱ_k · 2[Δ_κ̄^{k−1} + Δ_κ̄^{−(k−1)}]^{−1},   (4.30)

where ϱ_k ∈ (0, 1], strictly decreasing in k with ϱ_1 = 1, is given by an explicit ratio involving Δ_κ̄^{2(N−1)}, Δ_κ̄^{2(k−1)}, Δ_κ̄^{2[N−(k−1)]}, and Δ_κ̄^{2N} (see Appendix A). If α = 0, then we have for k ≤ N

    min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ / ‖g‖ = [ (N − (k−1))/N ]^{1/2}.                    (4.31)

Theorem 4.7. Let α_j = t^tr_{jN} on [α, β] with 0 ≤ α, and g_(j) = √(t^tr_{jN}) (1 ≤ j ≤ N). If 0 < α, then we have for k ≤ N

    [ (1 + κ̄) Φ_{τ,k} ]^{−1/2} ≤ min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ / ‖g‖ ≤ [ (1 + κ̄^{−1}) Φ_{τ,k} ]^{−1/2},   (4.32)

where Φ_{τ,k} is as in (4.23). If α = 0, then we have for k ≤ N

    min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ / ‖g‖ = [ 3/(k(4k² − 1)) ]^{1/2}.                   (4.33)

4.3. Bounds on individual singular values

Let the diagonal entries of the right-hand side of (4.5) be d_j (1 ≤ j ≤ k). Then

    d_1 = (2/N) [ ½|a_{00}|² + Σ_{i=1}^{k−1} |a_{0i}|² ],   d_j = (2/N) Σ_{i=j−1}^{k−1} |a_{j−1,i}|²  for 2 ≤ j ≤ k.   (4.34)

By Schur's theorem [5, p. 35], we have

Theorem 4.8. Let V_{k,N} have the translated Chebyshev zero nodes defined by (2.6) and (2.8) with arbitrary ω ≠ 0 and τ. The set of eigenvalues of (V̄_{k,N} V_{k,N}^T)^{−1}, which is {[σ_j(V_{k,N})]^{−2}}_{j=1}^k, majorizes {d_j}_{j=1}^k defined as in (4.34).

Recall our default ordering (1.6) on singular values, so that

    [σ_1(V_{k,N})]^{−2} ≤ [σ_2(V_{k,N})]^{−2} ≤ ... ≤ [σ_k(V_{k,N})]^{−2}.

What the majorization in Theorem 4.8 means is that if we let {d′_j}_{j=1}^k be the non-increasing reordering of {d_j}_{j=1}^k, i.e.,
d′_1 ≥ d′_2 ≥ ... ≥ d′_k, then

    Σ_{j=1}^i [σ_{k+1−j}(V_{k,N})]^{−2} ≥ Σ_{j=1}^i d′_j   for 1 ≤ i ≤ k,       (4.35)

which, using the equality of the traces, can also be equivalently stated as

    Σ_{j=1}^i [σ_j(V_{k,N})]^{−2} ≤ Σ_{j=k+1−i}^k d′_j   for 1 ≤ i ≤ k.         (4.36)

Corollary 4.1. Under the conditions of Theorem 4.8,

    σ_i(V_{k,N}) ≥ [ Σ_{j=k+1−i}^k d′_j ]^{−1/2}   for 1 ≤ i ≤ k,               (4.37)
    [ Σ_{j=1}^k d′_j ]^{−1/2} ≤ σ_k(V_{k,N}) ≤ [d′_1]^{−1/2}.                   (4.38)

Proof. Eq. (4.36) implies

    [σ_i(V_{k,N})]^{−2} ≤ Σ_{j=k+1−i}^k d′_j   for 1 ≤ i ≤ k,

which yields (4.37). The lower bound in (4.38) is (4.37) with i = k, while the upper bound follows because the largest eigenvalue of a Hermitian matrix is no smaller than its largest diagonal entry. □

The lower bounds in (4.37) are guaranteed to be very sharp for i = k because of (4.38), but may not be so for other i. Our numerical calculations for various intervals [α, β] show that they are pretty good for the first few smallest singular values, and then deteriorate as the index moves toward the largest ones. But for max{|α|, |β|} near 1, the lower bounds are sharp at both ends (i.e., for the largest and the smallest singular values). Exactly the same thing can be done with (4.4) upon extracting the diagonal entries of R_k^{−*} Π R_k^{−1}, but those diagonal entries relate to the coefficients in a much more complicated way. We shall not pursue this here.

5. V_N with Chebyshev extreme nodes

For the sake of presentation, throughout this section n = N − 1, and V_N will have the nodes α_{j+1} = τ^tr_{jn} for 0 ≤ j ≤ n as defined in (2.9). Set

    S_N  def=  ( T_0(τ_{0n})      T_0(τ_{1n})      ...  T_0(τ_{nn})
                 T_1(τ_{0n})      T_1(τ_{1n})      ...  T_1(τ_{nn})
                 ...              ...                   ...
                 T_{N−1}(τ_{0n})  T_{N−1}(τ_{1n})  ...  T_{N−1}(τ_{nn}) ).      (5.1)
Then V_N^T R_N = S_N^T. S_N is always real, while V_N and R_N may not be. Let

    Ω = diag(½, 1, ..., 1, ½) ∈ R^{N×N},                                        (5.2)

and define

    Υ  def=  S_N Ω S_N^T = (n/2) Ω^{−1}.                                        (5.3)

The last equality is well-known; see [, p. 33]. Now V_N^T = S_N^T R_N^{−1}. Extracting the first k columns from both sides yields the following theorem.

Theorem 5.1 []. Let V_N have the translated Chebyshev extreme nodes α_{j+1} = τ^tr_{jn} (0 ≤ j ≤ n) defined by (2.7) and (2.9) with arbitrary ω ≠ 0 and τ, and let the upper triangular R_k be defined as in (2.10) and S_N as in (5.1). Then V_{k,N}^T = S_{k,N}^T R_k^{−1}, where S_{k,N} def= (S_N)_(1:k,:) is S_N's first k rows.

5.1. Condition number κ_F(V_{k,N})

A rough bound for Υ = (n/2)Ω^{−1} is

    (n/2) I_N ⪯ Υ ⪯ n I_N,                                                      (5.4)

which is probably good enough for most occasions. Eq. (5.4) implies immediately

    (n/2) I_k ⪯ Υ_(1:k,1:k) ⪯ n I_k,   n^{−1} I_k ⪯ [Υ_(1:k,1:k)]^{−1} ⪯ (2/n) I_k.   (5.5)

By Theorem 5.1, similarly to the derivations in (4.4) and (4.5), we have⁶

    V̄_{k,N} Ω V_{k,N}^T = R_k^{−*} S_{k,N} Ω S_{k,N}^T R_k^{−1} = R_k^{−*} (S_N Ω S_N^T)_(1:k,1:k) R_k^{−1} = R_k^{−*} Υ_(1:k,1:k) R_k^{−1},   (5.6)
    (V̄_{k,N} Ω V_{k,N}^T)^{−1} = R_k [Υ_(1:k,1:k)]^{−1} R_k^*.                 (5.7)

Since ½ I_N ⪯ Ω ⪯ I_N and (5.5), we have

    ½ V̄_{k,N} V_{k,N}^T ⪯ V̄_{k,N} Ω V_{k,N}^T ⪯ V̄_{k,N} V_{k,N}^T,
    (n/2) R_k^{−*} R_k^{−1} ⪯ R_k^{−*} Υ_(1:k,1:k) R_k^{−1} ⪯ n R_k^{−*} R_k^{−1},
    n^{−1} R_k R_k^* ⪯ R_k [Υ_(1:k,1:k)]^{−1} R_k^* ⪯ (2/n) R_k R_k^*.

⁶ Again, one can also have V_{k,N} Ω V_{k,N}^T = R_k^{−T} Υ_(1:k,1:k) R_k^{−1} and (V_{k,N} Ω V_{k,N}^T)^{−1} = R_k [Υ_(1:k,1:k)]^{−1} R_k^T, similarly to (5.6) and (5.7).
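The identity (5.3) is a discrete orthogonality relation at the Chebyshev extreme nodes and can be confirmed directly; a short Python sketch (sizes arbitrary):

```python
import numpy as np

N = 8
n = N - 1
tau_nodes = np.cos(np.arange(N) * np.pi / n)                # tau_{jn}, j = 0..n, as in (2.7)
# S_N of (5.1): entry (i+1, j+1) is T_i(tau_{jn}) = cos(i * j * pi / n)
S = np.cos(np.outer(np.arange(N), np.arccos(tau_nodes)))
Omega = np.diag([0.5] + [1.0] * (n - 1) + [0.5])            # (5.2): first/last entries halved
Upsilon = S @ Omega @ S.T
# (5.3): Upsilon = (n/2) * Omega^{-1} = diag(n, n/2, ..., n/2, n)
assert np.allclose(Upsilon, (n / 2.0) * np.linalg.inv(Omega))
```

The halved end-weights in Ω are exactly the Σ″ convention of the Notation paragraph.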
Consequently,

    (n/2) ‖R_k^{−1}‖_F² ≤ Σ_j [σ_j(V_{k,N})]² ≤ 2n ‖R_k^{−1}‖_F²,               (5.8)
    (1/(2n)) Σ_{j=0}^{k−1} [S_{j,2}(ω, τ)]² ≤ Σ_j [σ_j(V_{k,N})]^{−2} ≤ (2/n) Σ_{j=0}^{k−1} [S_{j,2}(ω, τ)]².   (5.9)

Theorem 5.2. Let V_{k,N} have the translated Chebyshev extreme nodes τ^tr_{jn} defined by (2.7) and (2.9) with arbitrary ω ≠ 0 and τ. Then

    ‖V_{k,N}‖_F [ (1/(2n)) Σ_{j=0}^{k−1} [S_{j,2}(ω, τ)]² ]^{1/2} ≤ κ_F(V_{k,N}) ≤ ‖V_{k,N}‖_F [ (2/n) Σ_{j=0}^{k−1} [S_{j,2}(ω, τ)]² ]^{1/2}.

As with Theorem 4.2, we may specialize this theorem to an interval [α, β] with α = −β or αβ ≥ 0 in almost the same way, to conclude that Theorems 4.3 and 4.4 remain true with all occurrences of the translated Chebyshev zero nodes replaced by the translated Chebyshev extreme nodes. Detail is omitted.

5.2. Extreme examples for CG, MINRES, and GMRES

V_N here provides another example achieving the sharpness of the existing error bound for CG, as well as many examples with closed formulas for, or tight bounds on, MINRES and GMRES residuals. We shall compute or estimate

    min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ / ‖g‖ = [ e_1^T ( V̄_{k,N} [diag(|g|)]² V_{k,N}^T )^{−1} e_1 ]^{−1/2} / ‖g‖   (5.10)

under the following three different situations. In obtaining the equality in (5.10) by Lemma 4.2, we have assumed that diag(g)V_{k,N}^T has rank k. Similarly to (4.21), we also define κ and κ̄, but now the two are equal:

    κ = max_j τ^tr_{jn} / min_j τ^tr_{jn} = τ^tr_{0n}/τ^tr_{nn} = κ̄ = β/α,      (5.11)

whenever α_{j+1} = τ^tr_{jn} on [α, β] with 0 ≤ α, for which |T_j(τ)| = ½(Δ_κ̄^j + Δ_κ̄^{−j}).

(1) α_{j+1} = τ^tr_{jn} with ω ≠ 0 and τ arbitrary, and, for 1 ≤ j ≤ N,

    g_(j) = √(Ω_(j,j)),                                                         (5.12)

for which ‖g‖ = √n;
(2) α_{j+1} = τ^tr_{jn} on [α, β] with 0 < α, and

    g_(j+1) = 1/√(2 τ^tr_{jn})  for j ∈ {0, n},   g_(j+1) = 1/√(τ^tr_{jn})  for 1 ≤ j ≤ n−1,   (5.13)

for which a closed formula for ‖g‖, involving √(αβ) and Δ_κ̄^{2n}, was obtained in [];

(3) α_{j+1} = τ^tr_{jn} on [α, β] with 0 ≤ α, and

    g_(j+1) = √(τ^tr_{jn}/2)  for j ∈ {0, n},   g_(j+1) = √(τ^tr_{jn})  for 1 ≤ j ≤ n−1,       (5.14)

for which ‖g‖ = √(n(α+β)/2).

Note that the case (5.13) has already been dealt with in []. Define

    Ψ_{τ,k} = Σ′_{j=0}^{k−1} [T_j(τ)]²  for k ≤ n,   Ψ_{τ,k} = Σ″_{j=0}^{k−1} [T_j(τ)]²  for k = N.   (5.15)

Theorem 5.3. Let V_{k,N} have the translated Chebyshev extreme nodes defined by (2.7) and (2.9) with arbitrary ω ≠ 0 and τ, and let g be as in (5.12). Then for k ≤ N,

    min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ / ‖g‖ = (2Ψ_{τ,k})^{−1/2}.               (5.16)

If α_{j+1} = τ^tr_{jn} on [α, β] with 0 < α, then for k ≤ N,

    (2Ψ_{τ,k})^{−1/2} = min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ / ‖g‖ ≤ 2[Δ_κ̄^{k−1} + Δ_κ̄^{−(k−1)}]^{−1}.   (5.17)

In particular, if α = 0, then (2Ψ_{τ,k})^{−1/2} = 1/√(2k−1) for k ≤ n and 1/√(2n) for k = N.

Proof. Note that [diag(|g|)]² = Ω. By (5.7) and (5.10), we have

    min_{u_(1)=1} ‖diag(g)V_{k,N}^T u‖ = [ e_1^T (V̄_{k,N} Ω V_{k,N}^T)^{−1} e_1 ]^{−1/2} = [ y^* [Υ_(1:k,1:k)]^{−1} y ]^{−1/2},

where y ∈ C^k and y_(j+1) = a_{0j} = T_j(τ). This gives (5.16). Eq. (5.17) follows from (5.16) and the existing CG error bound (1.2). If α = 0, then τ = −1 and thus all |a_{0j}| = 1. □

Li [] obtained the following theorem, in which (5.18) for k = N is due to [5].

Theorem 5.4 []. Let 0 < α < β, let g ∈ R^N be as in (5.13), and let V_{k,N} have the translated Chebyshev extreme nodes. Then for k ≤ N,
20 8 R.-C. Li / Linear Algebra and its Applications 48 (008) Ratios of upper bounds over the actual Translated Chebyshev extreme nodes on [/,] N =49 =0 =0 Residual bounds over actual residuals Translated Chebyshev extreme nodes on [/,] =0 =0 N= k k Fig. 5.. Ratios of [Δ k in (5.); Right: g as in (5.3). + Δ (k ) ] over u() = diag(g)v k,n T u g, where α j+ = τjn tr on [/, ]. Left: g as diag(g)vk,n T u [ = ρ k Δ k u () = g where ( ) ( < + Δn Δ n + ρk = + Δ(k ) ] + Δ (k ), (5.8) + Δ [n (k )] Δ n + ). (5.9) Fig. 5. plots the ratio of [Δ k + Δ (k ) ] over (5.0) for α j+ = τjn tr on [/, ] with two different g. It is interesting to notice a sudden drop at k = N for g as in (5.), and for g as in (5.3), the ratio is one at k = and N and for all other k the ratios is no bigger than, as guaranteed by Theorem 5.4. Proof of the next theorem is given in Appendix B. Theorem 5.5. Let 0 α<β,g R N as in (5.4), and V k,n have the translated Chebyshev extreme nodes. If 0 <α,we have for k N diag(g)v + Ξ / k,n T u u () = g + Ξ /, (5.0) where Ξ = Ψ τ,k + + η, Ψ τ,k is as in (5.5) and η = 0 for k n and η = T N (τ) for k = N. In particular, if α = 0, then we have diag(g)vk,n T u 3 = u () = g k(4k for k n, (5.) ) and it is zero for k = N.
21 R.-C. Li / Linear Algebra and its Applications 48 (008) Bounds on individual singular values Let the diagonal entries of R k [Υ (:k,:k) ] R k be δ j = ej T R k[υ (:k,:k) ] Rk e j for j k. (5.) By Schur s theorem [5, p. 35], we have Theorem 5.6. Let V k,n have the translated Chebyshev extreme nodes defined by (.7) and (.9) with arbitrary ω/= 0 and τ. The set of eigenvalues of (V k,n ΩVk,N T ) which is {[σ j (V k,n Ω / )] } k j= majorizes {δ j } k j= defined as in (5.). Let the nonincreasing re-ordering of δ j s be δ j s, i.e., δ δ δ k. Theorem 5.6 implies, upon noticing σ j (V k,n ) σ j (V k,n Ω / ) σ j (V k,n ) i j= [ σj (V k,n ) ] i δ j for i k, (5.3) j= k [ σj (V k,n ) ] k δ j for i k. (5.4) j=i j=i Corollary 5.. Under the conditions of Theorem 5.6, / k σ i (V k,n ) for i k, (5.5) δ j j=i [ δ ] / σ (V k,n ) k δ j j= /. (5.6) Proof. Eq. (5.4) implies k [σ i (V k,n )] δ j for i k, j=i which yields (5.5). Take i = in(5.3), combining with (5.5) for i =, to get (5.6). Our comments at the end of Section 4.3 apply here, too. 6. V N with other orthogonal polynomial zero nodes Part of the material in Section 4 can be naturally extended to V N whose nodes are the zeros of the nth translated orthogonal polynomial from any orthogonal polynomial system. We shall only provide an outline here. Let p j (t), j = 0,,,...,denote a sequence of normalized orthogonal polynomials with respect to some weight function w(t) on an interval which may be open, half
22 84 R.-C. Li / Linear Algebra and its Applications 48 (008) open, or closed. For a list of well-known orthogonal polynomials such as T m, Legendre polynomials, etc., the reader is referred to [7, p. 57] or any books on orthogonal polynomials. Orthogonal polynomials have many beautiful properties. Useful to us here is that p m (t) is guaranteed to have exactly m distinct zeros 7 t jm, j =,,...,m, in the interval, and [3] m λ jm p r (t jm )p s (t jm ) = j= {, if r = s, 0, otherwise, for 0 r, s m (6.) as the result of the fact that the Gaussian quadrature formula m φ(t)w(t)dt ω jm φ(t jm ) j= is exact for all polynomials of degree no higher than m and p r (t)p s (t)w(t)dt = 0 for r /= s and for r = s, where the integral is taken over the support of w(t), ω jm are Christoffel numbers for p m.for numerical values of the nodes and Christoffel numbers for various orthgonal polynomials, the reader is referred to []. Given ω/= 0 and τ, define translated orthogonal polynomials by p m (z; ω,τ) def = p m (z/ω + τ), = a mm z m + a m m z m + +a m z + a 0m, whose zeros are t tr jm = ω(t jm τ), j m. Let V N be with α j = tjn tr and set R N as in (.0) but with a ij here. Define p 0 (t N ) p 0 (t N ) p 0 (t NN ) def p (t N ) p (t N ) p (t NN ) P N =.... (6.) p N (t N ) p N (t N ) p N (t NN ) Then V T N R N = P T N, and P N Ω N P T N = I N, (6.3) where Ω N = diag(ω N,ω N,...,ω NN ). Therefore VN T = PT N R N. Extracting the first k columns from both sides yields Vk,N T = PT k,n R k, similarly to Theorem 4., where P k,n = (P N ) (:k,:) is P N s first k rows. 7 Up to now, t jm is reserved for the zeros of the mth Chebyshev polynomial of the first kind. It, along with other previously reserved notation, will be reassigned in this section.
23 R.-C. Li / Linear Algebra and its Applications 48 (008) Let Ω k,n = (Ω N ) (:k,:k).wehave V k,n Ω k,n V T k,n =R k P k,n Ω k,n P T k,n R =Rk (P N Ω N P T N ) (:k,:k)rk k =Rk Rk, (6.4) (V k,n Ω k,n Vk,N T =R k Rk. (6.5) Upon noticing ( j ω jn )I k Ω k,n (max j ω jn )I k,wehave Rk F max j ω j [σ j (V k,n )] R k jn j ω, (6.6) jn F ( ω jn ) R k F j j [σ j (V k,n )] (max j ω jn ) R k F. (6.7) Now bounds on F (V k,n ) can be easily obtained in terms of a ij. For CG, MINRES, or GMRES residuals, we have exactly diag(g)v k,n T u = u () = k p j (τ) j=0 /, (6.8) where g C N with g (j) = ω jn. We may also get bounds on individual singular value σ j (V k,n ), following the lines of previous sections. For the similarity reason, detail is omitted. 7. Conclusions Vandermonde matrices with translated Chebyshev zero and extreme nodes are shown to have various interesting properties, derived from simple QR or QR-like decompositions. These decompositions allow us to obtain the behavior of their condition numbers, and bounds on their singular values. This simple QR-like decomposition for V N with translated Chebyshev zero nodes is shared by a much larger class of V N, i.e., those with nodes being translated zeros of any orthogonal polynomial. Consequently we can get about the same things as what we can get for V N with translated Chebyshev zero nodes. There are two immediate applications of studying Vandermonde matrices with translated Chebyshev nodes. The first one is to establish asymptotically optimal lower bounds on condition numbers of real rectangular Vandermonde matrices and to establish nearly optimally conditioned real rectangular Vandermonde matrices on a given interval. Previously similar results were obtained for real square Vandermonde matrices in [3,9,], except in [9,] where rectangular Vandermonde matrices were discussed but the results there were not satisfactory. The second application is their implications to the convergence analysis of CG, MINRES, and GMRES. 
It is observed that superlinear convergence [4,9] often occurs for CG; while the existing error bound [7,5] only guarantees linear convergence. Results in this paper can be used to construct examples as in [0,] for which errors of approximations by CG are comparable to the existing error bounds at all iteration steps.
24 86 R.-C. Li / Linear Algebra and its Applications 48 (008) Acknowledgments The author wishes to thank anonymous referees constructive comments that considerably improved the presentation of this paper. Appendix A. Proofs of Theorems 4.6 and 4.7 Lemma A.. Let θ kn = k N π( k N), and l an integer. Then N { ( ) cos lθ kn = m N, if l = mn f or some integer m, 0, otherwise. k= Proof. This is most likely well-known and can be verified by calculations. See also [0]. (A.) Lemma A.. Let Γ = diag(μ + ν cos θ N,...,μ+ ν cos θ NN ), T N as in (4.), Π as in (4.3), and define Θ μ,ν = T N ΓT T N. Then Θ μ,ν = N 4 (μπ + νπ H Π ) = N 4 Π (μπ + νh)π, where H R N N is tridiagonal with diagonal entries 0 and off-diagonal entries. Proof. For 0 i, j N, N (T N ΓT T N ) (i+,j+) = (T N ) (i+,k) (μ + ν cos θ kn )(T T N ) (k,j+) k= N = T i (t kn )(μ + ν cos θ kn )T j (t kn ) k= N = cos iθ kn (μ + ν cos θ kn ) cos jθ kn k= N N =μ cos iθ kn cos jθ kn + ν cos iθ kn cos θ kn cos jθ kn. k= k= So Θ μ,ν = μθ,0 + νθ 0,, where N (Θ,0 ) (i+,j+) = cos iθ kn cos jθ kn, k= N (Θ 0, ) (i+,j+) = cos iθ kn cos θ kn cos jθ kn. k= (A.) It is known Θ,0 = N Π already in (4.3). It remains to compute Θ 0,.
25 R.-C. Li / Linear Algebra and its Applications 48 (008) N N N 4 cos iθ kn cos θ kn cos jθ kn = cos(i + j + )θ kn + cos(i + j )θ kn k= k= k= k= N N + cos(i j + )θ kn + cos(i j )θ kn. Apply Lemma A. to conclude Θ 0, = N 4 Π H Π whose verification is straightforward, albeit tedious. Lemma A.3. Let m N and ξ C such that ( ξπ + H) (:m,:m) is nonsingular. Then the first entry of the solution to ( ξπ + H) (:m,:m) y = e is γ m γ + m for ξ / {, }, ξ y () = (γ m+γ + m ) m for ξ =, m for ξ =, where γ ± = ξ ± ξ. Proof. This is essentially [, Lemma 3.4] when ξ/ {, }. It can be verified that y (j) = ξ j (m j + ) when ξ =±. Proof of Theorem 4.6. We use (4.0). Let Γ = diag(tn tr,ttr N,...,ttr NN ) diag(μ + ν cos θ N,...,μ+ ν cos θ NN ), where μ = ωτ and ν = ω. Then V k,n [diag( g )] Vk,N T =V k,nγ Vk,N T ( ) e T = Γ ( e V k,n Γ ( e T Γ e e T V T = V k,n e ΓVk,N T ) k,n V k,n ΓV T k,n where e = (,,...,) T. Notice Vk,N T = TT k,n R k by Theorem 4. to get V k,n e=v k,n Vk,N T e =R T k (Θ,0) (:k,:k ) R k e =R T k (Θ,0) (:k,:k ) e, V k,n ΓVk,N T =R T k T k,n ΓTT k,n R k =R T k (Θ μ,ν) (:k,:k ) R k, where Θ μ,ν is as in Lemma A.. Recall 8 ), k= 8 This is well-known. See, e.g., [9, pp. 0 03], [3, p. 3].
26 88 R.-C. Li / Linear Algebra and its Applications 48 (008) ( ) ( B B C = C B B ) B B B B C B + B B C B B, assug all inversions exist, where C = B B B B. Wehave e T (V k,n[diag(g)] Vk,N T ) e =[ζ e T Vk,N T (V k,n ΓVk,N T ) V k,n e], where ζ = e T Γ e. But (A.3) e T Vk,N T (V k,n ΓVk,N T ) V k,n e = e T (Θ,0) (:k,:k ) [(Θ μ,ν ) (:k,:k ) ] (Θ,0 ) (:k,:k ) e = N e T [(Θ μ,ν) (:k,:k ) ] e. (A.4) Consider now α>0. Then τ<. For k N, by Lemma A.3, e T [(Θ μ,ν) (:k,:k ) ] e =N e T [(μπ + νh) (:k,:k )] e = Nω et [( τπ + H) (:k,:k )] e = Nω γ k τ (γ k γ k + + γ k + ), (A.5) where γ ± = τ ± τ. Since ζ = g,wehaveby(4.0) and (A.3) (A.5) diag(g)vk,n T u [ ] N = u () = g ωζ γ k γ k / + τ γ k + γ+ k. (A.6) We now compute ωζ τ. Let f(z)= T N (z/ω + τ).wehave N ζ = g = t tr = f (0) f(0) = T N (τ) ωt j= jn N (τ). Recall (.) and [7, p.37] ( t + ) N ( t t ) N t T N (t) = N, t to get T N (τ) = γ+ N + γ N and T N (τ) = N(γN + γ N )/ τ. Therefore, ωζ τ = N γ N γ + N γ N + γ + N. Eq. (A.6) now implies diag(g)vk,n T u [ = u () = g For τ = (β + α)/(β α) = ( + )/( ), γ k γ N + γ N + γ N γ N + γ k γ k = ( ) k Δ k, γ k + = ( ) k Δ (k ), γ k + + γ k + ] /. (A.7)
27 and therefore where R.-C. Li / Linear Algebra and its Applications 48 (008) diag(g)vk,n T u = u () = g ϱ k def = = [Δ k RHS of (A.8) [ + Δ (k ) ] [ (Δ k + Δ (k ) ) 4 [ = [ = (Δ k ΔN + Δ N Δ N Δ N =ϱ k [ Δ k Δ k Δ k + Δ (k ) Δ (k ) + Δ (k ) (ΔN + Δ N )(Δ(k ) Δ (k ) 4(Δ N Δ N + Δ (k ) )(Δ N (k ) Δ [N (k )] ] / (A.8) ], (A.9) ] / ) ) ] / ) Δ N Δ N ] (Δ (k ) + )(Δ [N (k )] / ) Δ N. Since Δ >, we have ( ) ( ) Δ(N ) Δ Δ N ϱk = Δ(k ) Δ [N (k )] Δ N, as expected. Eq. (4.3) for α = 0 can be obtained by exaing the proof above upon setting τ = and noticing T N ( )=( ) N, T N ( )= ( )N N, ζ =N /ω, and e T[(Θ μ,ν) (:k,:k ) ] e = k Nω. Proof of Theorem 4.7. Again we will use (4.0). Let Γ be as in the proof of Theorem 4.6. Then V k,n [diag( g )] V T k,n = V k,nγv T k,n = R T k where Θ μ,ν is as in Lemma A.. Set y = R T k e.wehave (Θ μ,ν ) (:k,:k) R k, e T (V k,n[diag(g)] V T k,n ) e =e T R k[(θ μ,ν ) (:k,:k) ] R T k e =y T [ N 4 Π (μπ + νh) Π ] (:k,:k) = 4 [ ] Nω zt ( τπ + H ) (:k,:k) z, where z = y except its first entry z () = y () /. Note y (j+) = T j (τ), and thus z = k 4 + T j (τ) = + Φ τ,k. j= Since g = N j= ω(cos θ jn τ) = Nωτ = N(α + β)/, we have by (4.0), y
28 830 R.-C. Li / Linear Algebra and its Applications 48 (008) ( u () = diag(g)v T k,n u g ) = β α 4 β + α [zt [( τπ + H) (:k,:k) ] z]. (A.0) When α>0, τ = ( + )/( ) < and 9 I N = ( τ )I N τπ + H ( τ +)I N = 4 I N, which gives 4 I N ( τπ + H) I N. Finally ( diag(g)v ( + ) z k,n T u ) u () = g + z, as was to be shown for (4.3). Now if α = 0, then τ =, T j (τ) = ( ) j, z = (/,,,,...) T, and (Π + H) (:k,:k) = LL T, where L R k k is lower bi-diagonal with the main diagonal entries and the off-diagonal entries. So the right-hand side of (A.0) is 4 L z 3 = k(4k ) and (4.33) now follows. Appendix B. Proof of Theorem 5.5 Lemma B.. Let ϑ kn = πk/n (0 k n), and l an integer. Then n N, if l = mn f or some integer m, cos lϑ kn = 0, if lis odd,, if l is even, but l /= mn f or any integer m. k=0 Proof. This is most likely well-known and can be verified by calculations. See also []. (B.) Lemma B. []. Let Γ = diag(μ + ν cos ϑ 0n,μ+ ν cos ϑ n,...,μ+ ν cos ϑ nn ) and define def Υ μ,ν = S N ΩΓS T N, where μ, ν C. We have Υ μ,ν = n 4 Ω (μω + νh)ω, (B.) where H is the same as the one in Lemma A.. 9 Better bounds can be obtained as follows. We note τπ + H = (τ + )Π + (Π + H), and thus [ ] [ ] ( τ ) + 4 sin (k + ) π I k ( τπ + H) (:k,:k) ( τ ) + 4 sin k (k + ) π I k because (Π + H) (:k,:k) is symmetric with eigenvalues [] 4 sin j (k+) π for j k. The improvements based upon this are more significantly for small k and become increasingly negligible for big k.
29 R.-C. Li / Linear Algebra and its Applications 48 (008) Proof of Theorem 5.5. We use (5.0). Let Γ = diag(τ0n tr,τtr n,...,τtr nn ) diag(μ + ν cos ϑ 0n,μ+ ν cos ϑ n,...,μ+ ν cos ϑ nn ), where μ = ωτ and ν = ω as in (.6). Then V k,n [diag( g )] Vk,N T =V k,nωγvk,n T =Rk T S k,n ΩΓS T k,n R k =Rk T (Υ μ,ν ) (:k,:k) Rk, e T (V k,n[diag( g )] Vk,N T ) e = et R k[(υ μ,ν ) (:k,:k) ] Rk T e = yt [ n 4 Ω (μω + νh)ω ] (:k,:k) y = nω zt [( τω + H) (:k,:k) ] z, where y = Rk Te, z = y except its first entry z () = y () / for k n and if k = N, z (N) = y (N) /, too. Since y (j+) = T j (τ), { 4 z = + k j= T } j (τ), if k n, 4 + n j= T j (τ) + 4 T N (τ) =, if k = N + Ψ τ,k + η. Since g = n j=0 ω(cos ϑ jn τ) (α + β) = Nωτ (α + β) = n(α + β),wehaveby (5.0), ( diag(g)vk,n T u ) = β α u () = g 4 β + α [zt [( τω + H) (:k,:k) ] z]. (B.3) When α>0, τ = ( + )/( ) <, and 0 I N = ( τ )I N τω + H ( τ +)I N = 4 I N, which gives 4 I N ( τω + H) I N. Finally ( ( + ) z u () = diag(g)vk,n T u ) g + z, as was to be shown for the case α>0. Now if α = 0 and k n, the right-hand sides of (A.0) and (B.3) are the same, and thus no additional proof is needed; if α = 0 and k = N, then the right-hand side of (B.3) is no longer well defined because Ω + H is singular, but its left-hand side is and it is equal to zero. To see this, we notice τnn tr with v (j+) (0 j n) being the coefficient of t j in n (τ tr φ n (t) = j=0 / n jn t) j=0 τ tr jn. = α = 0 and all other τ tr jn 0 As noted previously in the proof of Theorem 4.7, slightly better bounds can be obtained, too. > 0. Let v RN
30 83 R.-C. Li / Linear Algebra and its Applications 48 (008) Obviously v () = φ(0) =. It can be seen that diag(g)vn T v = 0 which implies the left-hand side of (B.3) is zero. References [] M. Abramowitz, I.A. Stegun (Eds.), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, Dover Publications, Inc., New York, 970. [] B. Beckermann, On the numerical condition of polynomial bases: estimates for the condition number of Vandermonde, Krylov and Hankel matrices, Habilitationsschrift, Universität Hannover, April 996, < bbecker/abstract/habilitationsschrift_beckermann.pdf>. [3] B. Beckermann, The condition number of real Vandermonde, Krylov and positive definite Hankel matrices, Numer. Math. 85 (4) (000) [4] B. Beckermann, A.B.J. Kuijlaars, Superlinear convergence of conjugate gradients, SIAM J. Numer. Anal. 39 () (00) [5] R. Bhatia, Graduate Texts in Mathematics, Springer, New York, 996. [6] Å. Björck, V. Pereyra, Solution of Vandermonde systems of equations, Math. Comput. 4 () (970) [7] P. Borwein, T. Erdélyi, Graduate Texts in Mathematics, Springer, New York, 995. [8] J. Demmel, Applied Numerical Linear Algebra, SIAM, Philadelphia, 997. [9] V.N. Faddeeva, Computational Methods of Linear Algebra, Dover Publications, New York, 959 (translated from the Russian by Curtis D. Benster). [0] B. Fisher, Wiley-Teubner Series: Advances in Numerical Mathematics, John Wiley & Sons Ltd and B.G. Teubner, New York and Leipzig, 996. [] W.L. Frank, Solution of linear systems by Richardson s method, J. ACM 7 (3) (960) [] W. Gautschi, Norm estimates for inverses of Vandermonde matrices, Numer. Math. 3 (975) [3] W. Gautschi, The condition of Vandermonde-like matrices involving orthogonal polynomials, Linear Algebra Appl. 5/53 (983) [4] A. Greenbaum, Iterative Methods for Solving Linear Systems, SIAM, Philadelphia, 997. [5] N.J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, 996. [6] I.C.F. 
Ipsen, Expressions and bounds for the GMRES residual, BIT 40 (3) (000) [7] S. Kaniel, Estimates for some computational techniques in linear algebra, Math. Comput. 0 (95) (966) [8] R.-C. Li, Norms of certain matrices with applications to variations of the spectra of matrices and matrix pencils, Linear Algebra Appl. 8 (993) [9] R.-C. Li, Asymptotically optimal lower bounds for the condition number of a real Vandermonde matrix, Technical Report , Department of Mathematics, University of Kentucky, 004. [0] R.-C. Li, Sharpness in rates of convergence for CG and symmetric Lanczos methods, Technical Report 005-0, Department of Mathematics, University of Kentucky. < math/mareport/> 005. [] R.-C. Li, Asymptotically optimal lower bounds for the condition number of a real Vandermonde matrix, SIAM J. Matrix Anal. Appl. 8 (3) (006) [] R.-C. Li, On Meinardus examples for the conjugate gradient method, Math. Comp. 77 (008) [3] J. Liesen, M. Rozlozník, Z. Strakoš, Least squares residuals and imal residual methods, SIAM J. Sci. Comput. 3 (5) (00) [4] J. Liesen, P. Tichý, The worst-case GMRES for normal matrices, BIT 44 () (004) [5] G. Meinardus, Über eine Verallgemeinerung einer Ungleichung von L. V. Kantorowitsch, Numer. Math. 5 (963) 4 3. [6] C.C. Paige, M.A. Saunders, Solution of sparse indefinite systems of linear equations, SIAM J. Numer. Anal. (4) (975) [7] Y. Saad, Iterative Methods for Sparse Linear Systems, second ed., SIAM, Philadelphia, 003. [8] Y. Saad, M. Schultz, GMRES: A generalized imal residual algorithm for solving nonsymmetric linear systems, SIAM J. Sci. Statist. Comput. 7 (986) [9] G.L.G. Sleijpen, A. van der Sluis, Further results on the convergence behavior of conjugate-gradients and Ritz values, Linear Algebra Appl. 46 (996) [30] L.N. Trefethen, D. Bau III, Numerical Linear Algebra, SIAM, Philadelphia, 997. [3] K. Zhou, J.C. Doyle, K. Glover, Robust and Optimal Control, Prentice Hall, Upper Saddle River, NJ, 995.
The Rate of Convergence of GMRES on a Tridiagonal Toeplitz Linear System
Numerische Mathematik manuscript No. (will be inserted by the editor) The Rate of Convergence of GMRES on a Tridiagonal Toeplitz Linear System Ren-Cang Li 1, Wei Zhang 1 Department of Mathematics, University
More informationON MEINARDUS EXAMPLES FOR THE CONJUGATE GRADIENT METHOD
MATHEMATICS OF COMPUTATION Volume 77 Number 61 January 008 Pages 335 35 S 005-5718(07)019-9 Article electronically published on September 17 007 ON MEINARDUS EXAMPLES FOR THE CONJUGATE GRADIENT METHOD
More informationNumerische Mathematik
Numer. Math. (009) 11:67 93 DOI 10.1007/s0011-008-006- Numerische Mathematik The rate of convergence of GMRES on a tridiagonal Toeplitz linear system Ren-Cang Li Wei Zhang Received: 1 November 007 / Revised:
More informationABSTRACT OF DISSERTATION. Wei Zhang. The Graduate School. University of Kentucky
ABSTRACT OF DISSERTATION Wei Zhang The Graduate School University of Kentucky 2007 GMRES ON A TRIDIAGONAL TOEPLITZ LINEAR SYSTEM ABSTRACT OF DISSERTATION A dissertation submitted in partial fulfillment
More informationON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH
ON ORTHOGONAL REDUCTION TO HESSENBERG FORM WITH SMALL BANDWIDTH V. FABER, J. LIESEN, AND P. TICHÝ Abstract. Numerous algorithms in numerical linear algebra are based on the reduction of a given matrix
More informationPreliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012
Instructions Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 The exam consists of four problems, each having multiple parts. You should attempt to solve all four problems. 1.
More information6.4 Krylov Subspaces and Conjugate Gradients
6.4 Krylov Subspaces and Conjugate Gradients Our original equation is Ax = b. The preconditioned equation is P Ax = P b. When we write P, we never intend that an inverse will be explicitly computed. P
More informationKey words. conjugate gradients, normwise backward error, incremental norm estimation.
Proceedings of ALGORITMY 2016 pp. 323 332 ON ERROR ESTIMATION IN THE CONJUGATE GRADIENT METHOD: NORMWISE BACKWARD ERROR PETR TICHÝ Abstract. Using an idea of Duff and Vömel [BIT, 42 (2002), pp. 300 322
More informationA Note on Eigenvalues of Perturbed Hermitian Matrices
A Note on Eigenvalues of Perturbed Hermitian Matrices Chi-Kwong Li Ren-Cang Li July 2004 Let ( H1 E A = E H 2 Abstract and à = ( H1 H 2 be Hermitian matrices with eigenvalues λ 1 λ k and λ 1 λ k, respectively.
More informationOn the influence of eigenvalues on Bi-CG residual norms
On the influence of eigenvalues on Bi-CG residual norms Jurjen Duintjer Tebbens Institute of Computer Science Academy of Sciences of the Czech Republic duintjertebbens@cs.cas.cz Gérard Meurant 30, rue
More informationFoundations of Matrix Analysis
1 Foundations of Matrix Analysis In this chapter we recall the basic elements of linear algebra which will be employed in the remainder of the text For most of the proofs as well as for the details, the
More informationRITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY
RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY ILSE C.F. IPSEN Abstract. Absolute and relative perturbation bounds for Ritz values of complex square matrices are presented. The bounds exploit quasi-sparsity
More informationOPTIMAL SCALING FOR P -NORMS AND COMPONENTWISE DISTANCE TO SINGULARITY
published in IMA Journal of Numerical Analysis (IMAJNA), Vol. 23, 1-9, 23. OPTIMAL SCALING FOR P -NORMS AND COMPONENTWISE DISTANCE TO SINGULARITY SIEGFRIED M. RUMP Abstract. In this note we give lower
More informationAMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning
AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 18 Outline
More informationI = i 0,
Special Types of Matrices Certain matrices, such as the identity matrix 0 0 0 0 0 0 I = 0 0 0, 0 0 0 have a special shape, which endows the matrix with helpful properties The identity matrix is an example
More informationWe first repeat some well known facts about condition numbers for normwise and componentwise perturbations. Consider the matrix
BIT 39(1), pp. 143 151, 1999 ILL-CONDITIONEDNESS NEEDS NOT BE COMPONENTWISE NEAR TO ILL-POSEDNESS FOR LEAST SQUARES PROBLEMS SIEGFRIED M. RUMP Abstract. The condition number of a problem measures the sensitivity
More informationMultiplicative Perturbation Analysis for QR Factorizations
Multiplicative Perturbation Analysis for QR Factorizations Xiao-Wen Chang Ren-Cang Li Technical Report 011-01 http://www.uta.edu/math/preprint/ Multiplicative Perturbation Analysis for QR Factorizations
More informationOn the Superlinear Convergence of MINRES. Valeria Simoncini and Daniel B. Szyld. Report January 2012
On the Superlinear Convergence of MINRES Valeria Simoncini and Daniel B. Szyld Report 12-01-11 January 2012 This report is available in the World Wide Web at http://www.math.temple.edu/~szyld 0 Chapter
More informationLarge-scale eigenvalue problems
ELE 538B: Mathematics of High-Dimensional Data Large-scale eigenvalue problems Yuxin Chen Princeton University, Fall 208 Outline Power method Lanczos algorithm Eigenvalue problems 4-2 Eigendecomposition
More informationOn the Skeel condition number, growth factor and pivoting strategies for Gaussian elimination
On the Skeel condition number, growth factor and pivoting strategies for Gaussian elimination J.M. Peña 1 Introduction Gaussian elimination (GE) with a given pivoting strategy, for nonsingular matrices
More informationarxiv: v2 [math.na] 27 Dec 2016
An algorithm for constructing Equiangular vectors Azim rivaz a,, Danial Sadeghi a a Department of Mathematics, Shahid Bahonar University of Kerman, Kerman 76169-14111, IRAN arxiv:1412.7552v2 [math.na]
More informationAbsolute value equations
Linear Algebra and its Applications 419 (2006) 359 367 www.elsevier.com/locate/laa Absolute value equations O.L. Mangasarian, R.R. Meyer Computer Sciences Department, University of Wisconsin, 1210 West
More informationOn the Preconditioning of the Block Tridiagonal Linear System of Equations
On the Preconditioning of the Block Tridiagonal Linear System of Equations Davod Khojasteh Salkuyeh Department of Mathematics, University of Mohaghegh Ardabili, PO Box 179, Ardabil, Iran E-mail: khojaste@umaacir
More informationTikhonov Regularization of Large Symmetric Problems
NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 2000; 00:1 11 [Version: 2000/03/22 v1.0] Tihonov Regularization of Large Symmetric Problems D. Calvetti 1, L. Reichel 2 and A. Shuibi
More informationOn prescribing Ritz values and GMRES residual norms generated by Arnoldi processes
On prescribing Ritz values and GMRES residual norms generated by Arnoldi processes Jurjen Duintjer Tebbens Institute of Computer Science Academy of Sciences of the Czech Republic joint work with Gérard
More informationInterlacing Inequalities for Totally Nonnegative Matrices
Interlacing Inequalities for Totally Nonnegative Matrices Chi-Kwong Li and Roy Mathias October 26, 2004 Dedicated to Professor T. Ando on the occasion of his 70th birthday. Abstract Suppose λ 1 λ n 0 are
More information4.8 Arnoldi Iteration, Krylov Subspaces and GMRES
48 Arnoldi Iteration, Krylov Subspaces and GMRES We start with the problem of using a similarity transformation to convert an n n matrix A to upper Hessenberg form H, ie, A = QHQ, (30) with an appropriate
More informationArnoldi Methods in SLEPc
Scalable Library for Eigenvalue Problem Computations SLEPc Technical Report STR-4 Available at http://slepc.upv.es Arnoldi Methods in SLEPc V. Hernández J. E. Román A. Tomás V. Vidal Last update: October,
More informationJordan Journal of Mathematics and Statistics (JJMS) 5(3), 2012, pp A NEW ITERATIVE METHOD FOR SOLVING LINEAR SYSTEMS OF EQUATIONS
Jordan Journal of Mathematics and Statistics JJMS) 53), 2012, pp.169-184 A NEW ITERATIVE METHOD FOR SOLVING LINEAR SYSTEMS OF EQUATIONS ADEL H. AL-RABTAH Abstract. The Jacobi and Gauss-Seidel iterative
More information1 Multiply Eq. E i by λ 0: (λe i ) (E i ) 2 Multiply Eq. E j by λ and add to Eq. E i : (E i + λe j ) (E i )
Direct Methods for Linear Systems Chapter Direct Methods for Solving Linear Systems Per-Olof Persson persson@berkeleyedu Department of Mathematics University of California, Berkeley Math 18A Numerical
More informationMINIMAL NORMAL AND COMMUTING COMPLETIONS
INTERNATIONAL JOURNAL OF INFORMATION AND SYSTEMS SCIENCES Volume 4, Number 1, Pages 5 59 c 8 Institute for Scientific Computing and Information MINIMAL NORMAL AND COMMUTING COMPLETIONS DAVID P KIMSEY AND
More informationLecture 3: QR-Factorization
Lecture 3: QR-Factorization This lecture introduces the Gram Schmidt orthonormalization process and the associated QR-factorization of matrices It also outlines some applications of this factorization
More informationEquivalence constants for certain matrix norms II
Linear Algebra and its Applications 420 (2007) 388 399 www.elsevier.com/locate/laa Equivalence constants for certain matrix norms II Bao Qi Feng a,, Andrew Tonge b a Department of Mathematical Sciences,
More informationLinear Algebra and its Applications
Linear Algebra and its Applications 435 (2011) 480 493 Contents lists available at ScienceDirect Linear Algebra and its Applications journal homepage: www.elsevier.com/locate/laa Bounding the spectrum
More informationIntroduction. Chapter One
Chapter One Introduction The aim of this book is to describe and explain the beautiful mathematical relationships between matrices, moments, orthogonal polynomials, quadrature rules and the Lanczos and
More informationOn the loss of orthogonality in the Gram-Schmidt orthogonalization process
CERFACS Technical Report No. TR/PA/03/25 Luc Giraud Julien Langou Miroslav Rozložník On the loss of orthogonality in the Gram-Schmidt orthogonalization process Abstract. In this paper we study numerical
More informationFEM and sparse linear system solving
FEM & sparse linear system solving, Lecture 9, Nov 19, 2017 1/36 Lecture 9, Nov 17, 2017: Krylov space methods http://people.inf.ethz.ch/arbenz/fem17 Peter Arbenz Computer Science Department, ETH Zürich
More informationError estimates for the ESPRIT algorithm
Error estimates for the ESPRIT algorithm Daniel Potts Manfred Tasche Let z j := e f j j = 1,..., M) with f j [ ϕ, 0] + i [ π, π) and small ϕ 0 be distinct nodes. With complex coefficients c j 0, we consider
More informationA Brief Outline of Math 355
A Brief Outline of Math 355 Lecture 1 The geometry of linear equations; elimination with matrices A system of m linear equations with n unknowns can be thought of geometrically as m hyperplanes intersecting
More informationIntrinsic products and factorizations of matrices
Available online at www.sciencedirect.com Linear Algebra and its Applications 428 (2008) 5 3 www.elsevier.com/locate/laa Intrinsic products and factorizations of matrices Miroslav Fiedler Academy of Sciences
More informationIterative methods for Linear System
Iterative methods for Linear System JASS 2009 Student: Rishi Patil Advisor: Prof. Thomas Huckle Outline Basics: Matrices and their properties Eigenvalues, Condition Number Iterative Methods Direct and
More informationA Randomized Algorithm for the Approximation of Matrices
A Randomized Algorithm for the Approximation of Matrices Per-Gunnar Martinsson, Vladimir Rokhlin, and Mark Tygert Technical Report YALEU/DCS/TR-36 June 29, 2006 Abstract Given an m n matrix A and a positive