MODIFIED GRAM SCHMIDT (MGS), LEAST SQUARES, AND BACKWARD STABILITY OF MGS-GMRES


SIAM J. MATRIX ANAL. APPL., Vol. 28, No. 1, pp. 264-284. © 2006 Society for Industrial and Applied Mathematics

MODIFIED GRAM-SCHMIDT (MGS), LEAST SQUARES, AND BACKWARD STABILITY OF MGS-GMRES

CHRISTOPHER C. PAIGE, MIROSLAV ROZLOŽNÍK, AND ZDENĚK STRAKOŠ

Abstract. The generalized minimum residual method (GMRES) [Y. Saad and M. Schultz, SIAM J. Sci. Statist. Comput., 7 (1986), pp. 856-869] for solving linear systems Ax = b is implemented as a sequence of least squares problems involving Krylov subspaces of increasing dimensions. The most usual implementation is modified Gram-Schmidt GMRES (MGS-GMRES). Here we show that MGS-GMRES is backward stable. The result depends on a more general result on the backward stability of a variant of the MGS algorithm applied to solving a linear least squares problem, and uses other new results on MGS and its loss of orthogonality, together with an important but neglected condition number, and a relation between residual norms and certain singular values.

Key words. rounding error analysis, backward stability, linear equations, condition numbers, large sparse matrices, iterative solution, Krylov subspace methods, Arnoldi method, generalized minimum residual method, modified Gram-Schmidt, QR factorization, loss of orthogonality, least squares, singular values

AMS subject classifications. 65F10, 65F20, 65F25, 65F35, 65F50, 65G50, 15A12, 15A42

DOI. 10.1137/

1. Introduction. Consider a system of linear algebraic equations Ax = b, where A is a given n×n (unsymmetric) nonsingular matrix and b a nonzero n-dimensional vector. Given an initial approximation x_0, one approach to finding x is to first compute the initial residual r_0 = b − Ax_0. Using this, derive a sequence of Krylov subspaces K_k(A, r_0) ≡ span{r_0, Ar_0, ..., A^{k−1}r_0}, k = 1, 2, ..., in some way, and look for approximate solutions x_k ∈ x_0 + K_k(A, r_0). Various principles are used for constructing x_k, which determine various Krylov subspace methods for solving Ax = b. Similarly, Krylov subspaces for A can be used to obtain eigenvalue approximations or to solve other problems involving A.

Krylov subspace methods are useful for solving problems involving very large sparse matrices, since these methods use these matrices only for multiplying vectors, and the resulting Krylov subspaces frequently exhibit good approximation properties. The Arnoldi method [1] is a Krylov subspace method designed for solving the eigenproblem of unsymmetric matrices. The generalized minimum residual method (GMRES) [20] uses the Arnoldi iteration and adapts it for solving the linear system Ax = b. GMRES can be computationally more expensive per step than some other methods; see, for example, Bi-CGSTAB [24] and QMR [9] for unsymmetric A, and LSQR [16] for unsymmetric or rectangular A. However, GMRES is widely used for solving linear systems arising from discretization of partial differential equations, and

Received by the editors May 2005; accepted for publication (in revised form) by M. Benzi October 8, 2005; published electronically March 7, 2006.

School of Computer Science, McGill University, Montreal, Quebec, Canada, H3A 2A7 (paige@cs.mcgill.ca). This author's work was supported by NSERC of Canada grant OGP.

Institute of Computer Science, Academy of Sciences of the Czech Republic, Pod Vodárenskou věží 2, 182 07 Praha 8, Czech Republic (miro@cs.cas.cz, strakos@cs.cas.cz).
The work of these authors was supported by the project ET within the National Program of Research "Information Society" and by the Institutional Research Plan AV0Z "Computer Science for the Information Society: Models, Algorithms, Applications."

as we will show, it is backward stable and it does effectively minimize the 2-norm of the residual at each step.

The most usual way of applying the Arnoldi method for large, sparse unsymmetric A is to use modified Gram-Schmidt orthogonalization (MGS). Unfortunately in finite precision computations this leads to loss of orthogonality among the MGS Arnoldi vectors. If these vectors are used in GMRES we have MGS-GMRES. Fortunately, experience suggests that MGS-GMRES succeeds despite this loss of orthogonality; see [ ]. For this reason we examine the MGS version of Arnoldi's algorithm and use this to show that the MGS-GMRES method does eventually produce a backward stable approximate solution when applied to any member of the following class of linear systems with floating point arithmetic unit roundoff ɛ (σ means singular value):

(1.1) Ax = b ≠ 0, A ∈ R^{n×n}, b ∈ R^n, σ_min(A) ≫ n²ɛ‖A‖_F;

see also the appendix. The restriction here is deliberately imprecise; see below. Moreover we show that MGS-GMRES gives backward stable solutions for its least squares problems at all iteration steps, thus answering important open questions. The proofs depend on new results on the loss of orthogonality and backward stability of the MGS algorithm, as well as the application of the MGS algorithm to least squares problems, and a lot of this paper is devoted to first obtaining these results.

While the kth step of MGS produces the kth orthonormal vector v_k, it is usual to say v_k is produced by step k−1 in the Arnoldi and MGS-GMRES algorithms. We will attempt to give a consistent development while avoiding this confusion. Thus step k of MGS-GMRES is essentially the (k+1)st step of MGS applied to [b, AV_k] to produce v_{k+1} in [b, AV_k] = V_{k+1}R_{k+1}, where V_{k+1} ≡ [v_1, ..., v_{k+1}] and R_{k+1} is upper triangular. In practice, if we reach a solution at step m−1 of MGS-GMRES, then numerically b must lie in the range of AV_{m−1}, so that B_m ≡ [b, AV_{m−1}] is numerically rank deficient. But this means we have to show that our rounding error analysis of MGS holds for rank deficient B_m, and this requires an extension of some results in [5].

In section 2 we describe our notation and present some of the tools we need which may be of more general use. For example we show the importance of the condition number κ_F(A) in (2.1), prove the existence of a nearby vector in Lemma 2.3, and provide a variant of the singular value-residual norm relations of [7] in Theorem 2.4. In sections 3.1-3.2 we review MGS applied to n×m B of rank m, and its numerical equivalence to the Householder QR reduction of B augmented by an m×m matrix of zeros. In section 3.3 we show how the MGS rounding error results extend to the case of m > n, while in section 4 we show how these results apply to the Arnoldi algorithm. In section 5 we analyze the loss of orthogonality in MGS and the Arnoldi algorithm and how it is related to the near rank deficiency of the columns of B or its Arnoldi equivalent, refining a nice result of Giraud and Langou [10] and Langou [14]. Section 6 introduces the key step used to prove convergence of these iterations. In section 7.1 we prove the backward stability of the MGS algorithm applied to solving linear least squares problems of the form required by the MGS-GMRES algorithm, and in section 7.2 we show how loss of orthogonality is directly related to new normwise relative backward errors of a sequence of different least squares problems, supporting a conjecture on the convergence of MGS-GMRES and its loss of orthogonality; see [8]. In section 8.1
we show that at every step MGS-GMRES computes a backward stable solution for that step's linear least squares problem, and in section 8.2 we show that one of these solutions is also a backward stable solution for (1.1) in at most n+1 MGS steps.

The restriction on A in (1.1) is essentially a warning to be prepared for difficulties in using the basic MGS-GMRES method on singular systems; see, for example, [6, 3]. The imprecise nature of the condition (using ≫ instead of > with some constant) was chosen to make the presentation easier. A constant could be provided (perhaps closer to 100 than 10), but since the long bounding sequence used was so loose, it would be meaningless. The appendix suggests that the form n²ɛ‖A‖_F might be optimal, but since for large n rounding errors tend to combine in a semirandom fashion, it is reasonable to replace n² by n, and a more practical requirement than (1.1) might be

(1.2) For large n, nɛ‖A‖_F/σ_min(A) ≪ 0.1.

2. Notation and mathematical basics. We describe the notation we will use, together with some generally useful results. We use ≡ to mean "is defined as" in the first occurrence of an expression, but in any later occurrences of this expression it means "is equivalent to" (by earlier definition). A bar above a symbol will denote a computed quantity, so if V_k is an ideal mathematical quantity, V̄_k will denote its actual computed value. The floating point arithmetic unit roundoff will be denoted by ɛ (half the machine epsilon; see [3]), I_n denotes the n×n unit matrix, e_j will be the jth column of a unit matrix I, so Be_j is the jth column of B, and B_{i:j} ≡ [Be_i, ..., Be_j]. We will denote the absolute value of a matrix B by |B|, its Moore-Penrose generalized inverse by B^†, ‖·‖_F will denote the Frobenius norm, σ(·) will denote a singular value, and κ_2(B) ≡ σ_max(B)/σ_min(B); see (2.1) for κ_F(·). Matrices and vectors whose first symbol is Δ, such as ΔV_k, will denote rounding error terms.

For the rounding error analyses we will use Higham's notation [3]: c will denote a small integer whose exact value is unimportant (c might have a different value at each appearance) and

γ_n ≡ nɛ/(1 − nɛ), γ̃_n ≡ cnɛ/(1 − cnɛ).

Without mentioning it again, we will always assume the conditions are such that the denominators in objects like this (usually bounds) are positive; see, for example, [3, (9.6)]. We see γ̃_n/(1 − γ̃_n) = cnɛ/(1 − 2cnɛ), and might write γ̃_n/(1 − γ̃_n) = γ̃'_n for mathematical correctness, but will refer to the right-hand side as γ̃_n thereafter. E_m, Ê_m, Ẽ_m will denote matrices of rounding errors (see just before Theorem 3.3), and "‖E_me_j‖_2 ≤ γ‖Be_j‖_2" implies this holds for j = 1, ..., m unless otherwise stated.

Remark 2.1 (see also the appendix). An important idea used throughout this paper is that column bounds of the above form lead to several results which are independent of column scaling, and we take advantage of this by using the following condition number. Throughout the paper, D will represent any positive definite diagonal matrix. The choice of norms is key to making error analyses readable, and fortunately there is a compact column-scaling-independent result with many uses. Define

(2.1) κ_F(A) ≡ min over diagonal D > 0 of ‖AD‖_F/σ_min(AD).

This condition number leads to some useful new results.

Lemma 2.1. If E and B have m columns, then for any positive definite diagonal matrix D:

‖Ee_j‖_2 ≤ γ‖Be_j‖_2, j = 1, ..., m, implies ‖ED‖_F ≤ γ‖BD‖_F;

‖Ee_j‖_2 ≤ γ‖Be_j‖_2 for j = 1, ..., m and rank(B) = m imply ‖EB^†‖_F ≤ γκ_F(B).

With the QR factorization B = QR, this leads to ‖ER^{−1}‖_F ≤ γκ_F(B) = γκ_F(R).

Proof. ‖Ee_j‖_2 ≤ γ‖Be_j‖_2 implies ‖EDe_j‖_2 ≤ γ‖BDe_j‖_2, so ‖ED‖_F ≤ γ‖BD‖_F. For B of rank m, (BD)^† = D^{−1}B^†, ‖(BD)^†‖_2 = 1/σ_min(BD), and so

‖EB^†‖_F = ‖ED(BD)^†‖_F ≤ ‖ED‖_F ‖(BD)^†‖_2 ≤ γ‖BD‖_F/σ_min(BD).

Since this is true for all such D, we can take the minimum, proving our results.
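To make the role of κ_F concrete, here is a small numerical sketch (ours, not the paper's). It evaluates ‖AD‖_F/σ_min(AD) for two diagonal scalings, using column equilibration as a cheap heuristic stand-in for the minimizing D in (2.1); the test matrix and names are illustrative assumptions only.

```python
import numpy as np

def kappa_F_scaled(A, d):
    """||A D||_F / sigma_min(A D) for D = diag(d) with d > 0."""
    AD = A * d                                   # scales column j of A by d[j]
    return np.linalg.norm(AD, 'fro') / np.linalg.svd(AD, compute_uv=False)[-1]

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5)) @ np.diag([1.0, 1e3, 1e-4, 1.0, 1e6])  # badly scaled columns

d_unit = np.ones(A.shape[1])
d_col  = 1.0 / np.linalg.norm(A, axis=0)         # equilibrate columns to unit 2-norm

print("kappa_F candidate with D = I           :", kappa_F_scaled(A, d_unit))
print("kappa_F candidate with column scaling D:", kappa_F_scaled(A, d_col))
```

Column equilibration is only a heuristic surrogate for the minimizing D, but it already shows how much smaller κ_F(A) can be than the unscaled ratio ‖A‖_F/σ_min(A) when the columns of A are badly scaled.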

Lemma 2.2. If the m×m matrix R is nonsingular and P^TP = I in PR = B + E, and γκ_F(B) < 1, then

‖Ee_j‖_2 ≤ γ‖Be_j‖_2, j = 1, ..., m, implies ‖ER^{−1}‖_F ≤ γκ_F(B)/(1 − γκ_F(B)).

Proof. For any D in (2.1), ‖Ee_j‖_2 ≤ γ‖Be_j‖_2 implies ‖ED‖_F ≤ γ‖BD‖_F, and then σ_min(RD) ≥ σ_min(BD) − γ‖BD‖_F, so ‖ER^{−1}‖_F = ‖ED(RD)^{−1}‖_F is bounded by

‖ED‖_F ‖(RD)^{−1}‖_2 ≤ γ‖BD‖_F/(σ_min(BD) − γ‖BD‖_F) = [γ‖BD‖_F/σ_min(BD)]/[1 − γ‖BD‖_F/σ_min(BD)].

Taking the minimum over D proves the result.

Suppose V̄_m ≡ [v̄_1, ..., v̄_m] is an n×m matrix whose columns have been computationally normalized to have 2-norms of 1, and so have norms in [1 − γ_n, 1 + γ_n]. Now define Ṽ_m ≡ [ṽ_1, ..., ṽ_m], where ṽ_j is just the correctly normalized version of v̄_j, so

(2.2) V̄_m = Ṽ_m(I + Δ_m), Δ_m ≡ diag(ν_j), where |ν_j| ≤ γ_n, j = 1, ..., m;
V̄_m^T V̄_m = Ṽ_m^T Ṽ_m + Ṽ_m^T Ṽ_m·Δ_m + Δ_m·Ṽ_m^T Ṽ_m + Δ_m·Ṽ_m^T Ṽ_m·Δ_m,
‖V̄_m^T V̄_m − Ṽ_m^T Ṽ_m‖_F / ‖Ṽ_m^T Ṽ_m‖_F ≤ γ_n(2 + γ_n) ≤ 3γ_n.

From now on we will not document the analogues of the last step γ_n(2 + γ_n) ≤ 3γ_n but finish with 3γ_n. In general it will be as effective to consider Ṽ_m as V̄_m, and we will develop our results in terms of Ṽ_m rather than V̄_m. The following will be useful here:

(2.3) ‖[Ṽ_m, I_n]‖_2² = ‖I_n + Ṽ_mṼ_m^T‖_2 = 1 + ‖Ṽ_mṼ_m^T‖_2 = 1 + ‖Ṽ_m‖_2² ≤ 1 + ‖Ṽ_m‖_F² = 1 + m.

Lemma 2.3 deals with the problem: Suppose we have d ∈ R^n and we know for some unknown perturbation f ∈ R^{m+n} that ‖[0^T, d^T]^T + f‖_2 = ρ. Is there a perturbation g of the same dimension as d, and having a similar norm to that of f, such that ‖d + g‖_2 = ρ also? Here we show such a g exists in the form g = Nf, ‖N‖_2 ≤ √2.

Lemma 2.3. For a given d ∈ R^n and unknown f ∈ R^{m+n}, if

0 ≠ [0^T, d^T]^T + f = pρ, ρ ≡ ‖[0^T, d^T]^T + f‖_2, so that ‖p‖_2 = 1,

where p and f are partitioned as p = [p_1^T, p_2^T]^T and f = [f_1^T, f_2^T]^T with p_2, f_2 ∈ R^n, then there exist a scalar σ with 0 ≤ σ ≤ 1, a vector v ∈ R^n with ‖v‖_2 = 1, and an n×(m+n) matrix N of the form

(2.4) N ≡ [v(1 + σ)^{−1}p_1^T, I_n],

so that

(2.5) d + Nf = vρ.

This gives

(2.6) ‖[0^T, d^T]^T + f‖_2 = ‖d + Nf‖_2 = ρ, ‖N‖_2 ≤ √2.

Proof. Define σ ≡ ‖p_2‖_2. If σ = 0 take any v ∈ R^n with ‖v‖_2 = 1. Otherwise define v ≡ p_2/σ, so ‖v‖_2 = 1. In either case p_2 = vσ and p_2^T p_2 = σ². Now define N as in (2.4), so, using ‖p_1‖_2² = 1 − σ²,

d + Nf = d + v(1 + σ)^{−1}‖p_1‖_2²ρ + f_2 = p_2ρ + v(1 − σ)ρ = vρ,
NN^T = I + v(1 + σ)^{−2}(1 − σ²)v^T, ‖N‖_2² = ‖NN^T‖_2 = 1 + (1 − σ)/(1 + σ) ≤ 2,

proving (2.5) and (2.6).
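As a sanity check on the construction in Lemma 2.3 as reconstructed above, the following sketch (ours, with made-up data) builds N from the partitioned p and verifies (2.5)-(2.6) numerically.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 5
d = rng.standard_normal(n)
f = 0.1 * rng.standard_normal(m + n)          # the "unknown" perturbation

z   = np.concatenate([np.zeros(m), d]) + f    # [0; d] + f
rho = np.linalg.norm(z)
p   = z / rho
p1, p2 = p[:m], p[m:]

sigma = np.linalg.norm(p2)
v = p2 / sigma if sigma > 0 else np.eye(n)[:, 0]
N = np.hstack([np.outer(v, p1) / (1.0 + sigma), np.eye(n)])   # (2.4)

print(np.linalg.norm(d + N @ f), rho)          # equal: ||d + N f||_2 = rho, cf. (2.5)-(2.6)
print(np.linalg.norm(N, 2), np.sqrt(2.0))      # ||N||_2 <= sqrt(2)
```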

Lemma 2.3 is a refinement of a special case of [5, Lem. 3.]; see also [3, Ex. 9.]. The fact that the perturbation g in d has the form of N times the perturbation f is important, as we shall see in section 7.1.

Finally we give a general result on the relation between least squares residual norms and singular values. The bounds below were given in [7, Thm. 4.], but subject to a condition in [7] that we cannot be sure will hold here. To prove that our results here hold subject to the different condition (1.1), we need to prove a related result. In order not to be too repetitive, we will prove a slightly more general result than we considered before, or need here, and make the theorem and proof brief.

Theorem 2.4. Let B ∈ R^{n×k} have rank s and singular values σ_1 ≥ ⋯ ≥ σ_s > 0. For 0 ≠ c ∈ R^n and a scalar φ ≥ 0, define ŷ ≡ B^†c, r̂ ≡ c − Bŷ, σ(φ) ≡ σ_{s+1}([cφ, B]), and δ(φ) ≡ σ(φ)/σ_s. If r̂φ ≠ 0, then σ(φ) > 0. If φ_0 ≡ σ_s/‖c‖_2, then for all φ ∈ [0, φ_0) we have 0 ≤ δ(φ) < 1. Finally, for all φ > 0 such that δ(φ) < 1, we have

σ²(φ)[φ^{−2} + ‖ŷ‖_2²] ≤ ‖r̂‖_2² ≤ σ²(φ)[φ^{−2} + ‖ŷ‖_2²/(1 − δ²(φ))].

Proof. r̂ is the least squares residual for By ≈ c, so r̂φ ≠ 0 means [cφ, B] has rank s+1 and σ(φ) > 0. If 0 ≤ φ < φ_0, then ‖cφ‖_2 < ‖c‖_2 φ_0 = σ_s, so via Cauchy's interlacing theorem, 0 ≤ σ(φ) ≡ σ_{s+1}([cφ, B]) < σ_s, giving 0 ≤ δ(φ) < 1. Using the singular value decomposition B = W diag(Σ, 0) Z^T, W^{−1} = W^T, Z^{−1} = Z^T, write

W^T[c, BZ] = [ a  Σ  0 ; ã  0  0 ], Σ ≡ diag(σ_1, ..., σ_s), a ≡ [α_1, ..., α_s]^T, ŷ = Z[ Σ^{−1}a ; 0 ].

Then it can be shown (see, for example, [6, (39.4)], [7], [5], et al.) that for all φ such that φ > 0 and δ(φ) < 1, σ²(φ) is the smallest root of

‖r̂‖_2² = σ²(φ)[ φ^{−2} + ∑_{i=1}^{s} (α_i²/σ_i²)/(1 − σ²(φ)/σ_i²) ].

But

‖ŷ‖_2² = ∑_{i=1}^{s} α_i²/σ_i² ≤ ∑_{i=1}^{s} (α_i²/σ_i²)/(1 − σ²(φ)/σ_i²) ≤ ∑_{i=1}^{s} (α_i²/σ_i²)/(1 − σ²(φ)/σ_s²) = ‖ŷ‖_2²/(1 − δ²(φ)),

while δ(φ) ≡ σ(φ)/σ_s < 1, and the result follows.

We introduced φ_0 to show δ(φ) < 1 for some φ > 0. For results related to Theorem 2.4 we refer to [5], which introduced this useful value φ_0.
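A quick numerical experiment (ours; random data with full-rank B, so s = k) checking the two-sided bound of Theorem 2.4 as stated above:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 30, 8                                  # B has full column rank, so s = k
B = rng.standard_normal((n, k))
c = rng.standard_normal(n)

y_hat = np.linalg.lstsq(B, c, rcond=None)[0]
r_hat = c - B @ y_hat
sig   = np.linalg.svd(B, compute_uv=False)
phi   = 0.5 * sig[-1] / np.linalg.norm(c)     # phi < phi_0 guarantees delta(phi) < 1

sig_phi = np.linalg.svd(np.column_stack([c * phi, B]), compute_uv=False)[k]  # sigma_{s+1}
delta   = sig_phi / sig[-1]

lower = sig_phi**2 * (phi**-2 + y_hat @ y_hat)
upper = sig_phi**2 * (phi**-2 + (y_hat @ y_hat) / (1.0 - delta**2))
print(lower <= r_hat @ r_hat <= upper)        # True: ||r_hat||^2 lies between the bounds
```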

3. The modified Gram-Schmidt (MGS) algorithm. In order to understand the numerical behavior of the MGS-GMRES algorithm, we first need a very deep understanding of the MGS algorithm. Here this is obtained by a further study of the numerical equivalence between MGS and the Householder QR factorization of an augmented matrix; see [5] and also, for example, [3, section 19.8]. We do not give exact bounds but work with terms of the form γ̃_n instead; see [3] and our section 2. The exact bounds will not even be approached for the large n we are interested in, so there is little reason to include such fine detail. In sections 3.1-3.2 we will review the MGS-Householder equivalence and extend some of the analysis that was given in [5] and [3, section 19.8].

3.1. The basic MGS algorithm. Given a matrix B ∈ R^{n×m} with rank m ≤ n, MGS in theory produces V_m and nonsingular R_m in the QR factorization

(3.1) B = V_mR_m, V_m^T V_m = I_m, R_m upper triangular,

where V_m ≡ [v_1, ..., v_m], and the m×m matrix R_m ≡ (ρ_ij). The version of the MGS algorithm which immediately updates all columns computes a sequence of matrices B = B^(1), B^(2), ..., B^(m+1) = V_m ∈ R^{n×m}, where B^(i) = [v_1, ..., v_{i−1}, b_i^(i), ..., b_m^(i)]. Here the first (i−1) columns are final columns in V_m, and b_i^(i), ..., b_m^(i) have been made orthogonal to v_1, ..., v_{i−1}. In the ith step we take

(3.2) ρ_ii := ‖b_i^(i)‖_2 ≠ 0 since rank(B) = m, v_i := b_i^(i)/ρ_ii,

and orthogonalize b_{i+1}^(i), ..., b_m^(i) against v_i using the orthogonal projector I − v_iv_i^T,

(3.3) ρ_ij := v_i^T b_j^(i), b_j^(i+1) := b_j^(i) − v_iρ_ij, j = i+1, ..., m.

We see B^(i) = B^(i+1)R^(i), where R^(i) has the same ith row as R_m but is the unit matrix otherwise. Note that in the mth step no computation is performed in (3.3), so that after m steps we have obtained the factorization

(3.4) B = B^(1) = B^(2)R^(1) = B^(3)R^(2)R^(1) = ⋯ = B^(m+1)R^(m) ⋯ R^(1) = V_mR_m,

where in exact arithmetic the columns of V_m are orthonormal by construction. This formed R_m a row at a time. If the jth column of B is only available after v_{j−1} is formed, as in MGS-GMRES, then we usually form R_m a column at a time. This does not alter the numerical values if we produce ρ_{1j}, b_j^(2); ρ_{2j}, b_j^(3); etc. It was shown in [3] that for the computed R̄_m and V̄_m in MGS

(3.5) B + E = V̄_mR̄_m, ‖E‖_2 ≤ c_1(m, n)ɛ‖B‖_2, ‖I − V̄_m^T V̄_m‖_2 ≤ c_2(m, n)ɛκ_2(B),

where c_i(m, n) denoted a scalar depending on m, n and the details of the arithmetic. We get a deeper understanding by examining the MGS-Householder QR relationship.
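Before turning to that relationship, here is a minimal NumPy sketch of the column-updating MGS of (3.2)-(3.3) (ours, for illustration only; the paper contains no code).

```python
import numpy as np

def mgs(B):
    """Modified Gram-Schmidt QR of an n x m matrix B of rank m: B = V @ R."""
    V = np.array(B, dtype=float)               # plays the role of B^(i), updated in place
    n, m = V.shape
    R = np.zeros((m, m))
    for i in range(m):
        R[i, i] = np.linalg.norm(V[:, i])      # rho_ii = ||b_i^(i)||_2
        V[:, i] /= R[i, i]                     # v_i
        for j in range(i + 1, m):              # orthogonalize remaining columns against v_i
            R[i, j] = V[:, i] @ V[:, j]        # rho_ij = v_i^T b_j^(i)
            V[:, j] -= V[:, i] * R[i, j]       # b_j^(i+1)
    return V, R

rng = np.random.default_rng(3)
B = rng.standard_normal((6, 4))
V, R = mgs(B)
print(np.linalg.norm(B - V @ R))               # ~ machine precision
print(np.linalg.norm(np.eye(4) - V.T @ V))     # loss of orthogonality, roughly eps * kappa_2(B)
```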

3.2. MGS as a Householder method. The MGS algorithm for the QR factorization of B can be interpreted as an orthogonal transformation applied to the matrix B augmented with a square matrix of zero elements on top. This is true in theory for any method of QR factorization, but for Householder's method it is true in the presence of rounding errors as well. This observation was made by Charles Sheffield and relayed to the authors of [5] by Gene Golub. First we look at the theoretical result. Let B ∈ R^{n×m} have rank m, and let O_m ∈ R^{m×m} be a zero matrix. Consider the QR factorization

(3.6) [ O_m ; B ] = P_m [ R ; 0 ], P_m ≡ [ P_11  P_12 ; P_21  P_22 ], P_m^T P_m = I_{m+n},

where P_11 is m×m. Since B has rank m, P_11 is zero, P_21 is an n×m matrix of orthonormal columns, and, see (3.1), B = V_mR_m = P_21R. If upper triangular R_m and R are both chosen to have positive diagonal elements in B^T B = R_m^T R_m = R^T R, then R_m = R by uniqueness, so P_21 = V_m can be found from any QR factorization of the augmented matrix [O_m; B].

The last n columns of P_m are then arbitrary up to an n×n orthogonal multiplier, but in theory the Householder reduction produces, see [5], the (surprisingly symmetric) orthogonal matrix

(3.7) P_m = [ O_m  V_m^T ; V_m  I − V_mV_m^T ],

showing that in this case P_m is fully defined by V_m.

A crucial result for this paper is that the Householder QR factorization giving (3.6) is also numerically equivalent to MGS applied to B. A close look at this Householder reduction, see, for example, [5], shows that for the computed version

(3.8) P̄_m^T ≡ P̄^(m) ⋯ P̄^(2)P̄^(1), P̄^(j) = I − p̄_jp̄_j^T, p̄_j = [ e_j ; −v̄_j ], j = 1, ..., m,

where the v̄_j are numerically identical to the computed v̄_j in (3.2), so, for example, after the first two Householder transformations our computed equivalent of P̄^(2)P̄^(1)[O_m; B] is

(3.9) [ ρ̄_11  ρ̄_12  ρ̄_13  ⋯  ρ̄_1m ; 0  ρ̄_22  ρ̄_23  ⋯  ρ̄_2m ; 0  0  0  ⋯  0 ; 0  0  b̄_3^(3)  ⋯  b̄_m^(3) ]

(the third block row stands for rows 3, ..., m of the top block), where the ρ̄_jk and b̄_k^(j) are also numerically identical to the corresponding computed values in (3.2) and (3.3). That is, in practical computations, the v̄_j, ρ̄_jk, and b̄_k^(j) are identical in both algorithms; see [5, p. 79]. Note that the jth row of R̄_m is completely formed in the jth step and not touched again, while b̄_j^(j) is eliminated.

3.3. MGS applied to n×m B with m > n. The paper [5] was written assuming that m ≤ n and the n×m matrix B in (3.1) had rank m, but it was mentioned in [5, p. 8] that the rank condition was not necessary for proving the equivalence mentioned in the last paragraph of section 3.2 above. For computations involving n×m B with m > n, Householder QR on B will stop in at most n steps, but both MGS on B, and Householder QR on [O_m; B] in (3.6), can nearly always be carried on for the full m steps. The MGS-Householder QR equivalence also holds for m > n, since the MGS and augmented Householder methods, being identical theoretically and numerically, either both stop with some ρ_kk = 0, k < m, see (3.2), or both carry on to step m. It is this m > n case we need here, and we extend the results of [5] to handle this. Because of this numerical equivalence, the backward error analysis for the Householder QR factorization of the augmented matrix in (3.6) can also be applied to the MGS algorithm on B. Two basic lemmas contribute to Theorem 3.3 below.

Lemma 3.1. In dealing with Householder transformations such as (3.8), Wilkinson [6] pointed out that it is perfectly general to analyze operations with P = I − pp^T for p having no zero elements. (This means we can drop the zero elements of p and the corresponding elements of the unit matrix and vector that P is applied to. In (3.8) each p̄_j has at most n+1 nonzero elements that we need to consider.)
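The Sheffield observation of section 3.2 is also easy to see experimentally: Householder QR of the augmented matrix [O_m; B] reproduces the MGS factors. The sketch below (ours) uses NumPy's Householder-based QR as a stand-in for the reduction in (3.6)-(3.8); since LAPACK's sign conventions differ from the step-by-step reduction described above, the comparison is made after a sign fix and the agreement is at rounding level rather than exact bit-for-bit identity.

```python
import numpy as np

def mgs(B):
    # compact MGS, as in (3.2)-(3.3), repeated here so the example is self-contained
    V = np.array(B, dtype=float); m = V.shape[1]; R = np.zeros((m, m))
    for i in range(m):
        R[i, i] = np.linalg.norm(V[:, i]); V[:, i] /= R[i, i]
        for j in range(i + 1, m):
            R[i, j] = V[:, i] @ V[:, j]; V[:, j] -= V[:, i] * R[i, j]
    return V, R

rng = np.random.default_rng(4)
n, m = 6, 4
B = rng.standard_normal((n, m))
V_mgs, R_mgs = mgs(B)

Q, R_hh = np.linalg.qr(np.vstack([np.zeros((m, m)), B]))   # Householder QR of [O_m; B]
S = np.diag(np.sign(np.diag(R_hh)))                        # undo LAPACK's sign choices
Q, R_hh = Q @ S, S @ R_hh

print(np.linalg.norm(R_hh - R_mgs))        # triangular factors agree to rounding level
print(np.linalg.norm(Q[m:, :] - V_mgs))    # bottom n rows of Q reproduce V_m, cf. (3.6)-(3.7)
```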

8 ROUNDING ERROR ANALYSIS OF MGS-GMRES 7 Lemma 3. (see [3, Lem. 9.3). In practice, if j Householder transformations are applied to a vector b R n, the computed result c satisfies c = P j P P (b +Δb), Δb j γ n b. In Theorem 3.3, E m will refer to rounding errors in the basic MGS algorithm, while later Êm will refer to errors in the basic MGS algorithm applied to solving the equivalent of the MGS-GMRES least squares problem, and Ẽm will refer to errors in the MGS-GMRES algorithm. All these matrices will be of the following form: (3.0) E m R (m+n) m, E E m m }m }n. E m Theorem 3.3. Let R m and V m =[ v,..., v m be the computed results of MGS applied to B R n m as in (3.) (3.4), but now allow m>n.for j =,...,m, step j computes v j and the jth row of R (j+) (j+) m and b j+,..., b m (see (3.9)). Define ej p j =, P (j) = I p v j p T j, Pm = P () () (3.) P P (m), j ej ṽ j = v j / v j, p j =, P (j) = I p j p T j, Pm = P () () P P (m). ṽ j Then P (j) is the orthonormal equivalent of the computed version P (j) of the Householder matrix applied in the jth step of the Householder QR factorization of B in (3.6), so that P m T P m = I, and for the computed version R m of R = R m in (3.6), and any positive definite diagonal matrix D, see Lemma. (here j =,...,m), Rm E (3.) P m = m 0 B + E m ; Pm orthogonal; Rm,E m R m m ; E E m m ; E m e j j γ n Be j, E m D F m γ n BD F ; (3.3) (3.4) (3.5) E m R m e j Be j + E m e j ( + j γ n ) Be j ; E me =0, E me j j γn Be j, j =,...,m; E md F m γn (BD) :m F ; [ S P m = m (I S m )Ṽ m T Ṽ m (I S m ) I Ṽm(I S T m )Ṽ m T, Pm P m = I, where m m E m and S m are strictly upper triangular. The jth row of E m is wholly produced in step j, just as the jth row of R m is. The jth column of S m is not defined until step j and is not altered thereafter. (If MGS stops with ρ kk =0,see(3.), rows k,...,mof R m and E m are zero, and columns k,...,mof V m and S m are nonexistent, so we replace m above by k.) Proof. The MGS-augmented Householder QR equivalence for the case of m n was proven in [5, and that this extends to m>nis proven in the first paragraph of section 3.3. As a result we can apply Lemmas 3. and 3. to give (3.) (3.3). The ideal P in (3.6) has the structure in (3.7), but it was shown in [5, Thm. 4., and (4.5) (which did not require n m in our notation) that P m in (3.) and (3.) has the extremely important structure of (3.5) for some strictly upper triangular m m S m. Since E m = S m Rm, this is strictly upper triangular too.

9 7 C. C. PAIGE, M. ROZLOŽNÍK, AND Z. STRAKOŠ The rest follow with Lemmas 3. and 3.. We have used γ n = γ n+ rather than γ m+n because in each step, p j in (3.) has only n+ elements; see (3.9) and Lemma 3.. Row j in R m is not touched again after it is formed in step j, see (3.9), and so the same is true for row j in E m in (3.); see Lemma 3.. Since E m = S m Rm, the jth column of S m is not defined until ρ jj is computed in step j, and since these three matrices are all upper triangular, it is not altered in later steps. Finally we obtain new bounds in (3.4). The element ρ ij is formed by the one transformation P (i) in (3.) applied to (E m) ii =0) b (i) j in (3.9), and so from Lemma 3. we can say (remember (E m) ij γ n b (i) j γ n Be j, j = i+,...,m, which is quite loose but leads to the bounds in (3.4). Note that (3.4) involves j, rather than the j in previous publications. Remark 3.. It is counterintuitive that E m is strictly upper triangular, so we will explain it. We need only consider the first augmented Householder-MGS transformation of the first vector to form ρ in (3.9). We can rewrite the relevant part of the first transformation ideally as, see (3.) and Lemma 3., [ 0 ρ 0 v T P =, P = b 0 v I vv T, b = vρ, v =. From b we compute ρ and v and then define ṽ v/ v so ṽ =. In order for E me = 0 in (3.), there must exist a backward error term Δb such that 0 ṽ T 0 [ ρ ṽ I ṽṽ T =, b +Δb 0 which looks like n + conditions on the n-vector Δb. But multiplying throughout by P shows there is a solution Δb =ṽ ρ b. The element above Δb is forced to be zero, so that there are actually n+ conditions on n+ unknowns. An error analysis (see Lemma 3.) then bounds Δb γ n b. 4. The Arnoldi algorithm as MGS. The Arnoldi algorithm [ is the basis of MGS-GMRES. We assume that the initial estimate of x in (.) is x 0 = 0, so that the initial residual r 0 = b, and use the Arnoldi algorithm with ρ b, v b/ρ, to sequentially generate the columns of V k+ [v,...,v k+ via the ideal process: (4.) AV k = V k H k,k + v k+ h k+,k e T k = V k+ H k+,k, Vk+V T k+ = I k+. Here k kh k,k =(h ij ) is upper Hessenberg, and we stop at the first h k+,k =0. Because of the orthogonality, this ideal algorithm must stop for some k n. Then AV k = V k H k,k, where H k,k has rank at least k. If h k+,k = 0 and H k,k has rank k, there exists a nonzero z such that AV k z = V k H k,k z = 0, so that A must be singular. Thus when A is nonsingular so is H k,k, and so in MGS-GMRES, solving H k,k y = e ρ and setting x = V k y solves (.). But if A is singular, this might not provide a solution even to consistent Ax = b: [ 0 0 A =, x =, v 0 0 = b = Ax = [ 0, AV = V H,, H, =0. Thus it is no surprise that we will require a restriction of the form (.) to ensure that the numerical MGS-GMRES algorithm always obtains a meaningful solution.
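A minimal sketch of the Arnoldi process (4.1) implemented with MGS orthogonalization, as used by MGS-GMRES (ordinary NumPy; the routine and its names are our own illustration, not the paper's).

```python
import numpy as np

def arnoldi_mgs(A, b, k):
    """k steps of MGS-Arnoldi: returns V (n x (k+1)) and H ((k+1) x k) with A V_k ~ V_{k+1} H."""
    n = b.shape[0]
    V = np.zeros((n, k + 1))
    H = np.zeros((k + 1, k))
    V[:, 0] = b / np.linalg.norm(b)
    for j in range(k):
        w = A @ V[:, j]                        # next Krylov direction
        for i in range(j + 1):                 # MGS orthogonalization against v_1, ..., v_{j+1}
            H[i, j] = V[:, i] @ w
            w -= H[i, j] * V[:, i]
        H[j + 1, j] = np.linalg.norm(w)
        if H[j + 1, j] == 0.0:                 # ideal termination: h_{k+1,k} = 0
            return V[:, :j + 1], H[:j + 1, :j]
        V[:, j + 1] = w / H[j + 1, j]
    return V, H

rng = np.random.default_rng(5)
A = rng.standard_normal((20, 20))
b = rng.standard_normal(20)
V, H = arnoldi_mgs(A, b, 10)
print(np.linalg.norm(A @ V[:, :10] - V @ H))       # Arnoldi relation (4.1)
print(np.linalg.norm(np.eye(11) - V.T @ V))        # orthogonality, gradually lost for hard problems
```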

To relate the Arnoldi and MGS-GMRES algorithms to the MGS algorithm, we now replace k+1 by m and say that in the mth MGS step these produce v_m, and MGS-GMRES also produces the approximation x_{m−1} = V_{m−1}y_{m−1} to the solution x of (1.1). Then apart from forming the Av_j, the algorithm we use to give (4.1) is identical to (3.1)-(3.3) with the same vectors v_j, and b_1 ≡ b, ρ_11 ≡ ρ; and for j = 1, ..., m−1, b_{j+1} ≡ Av_j, ρ_{i,j+1} ≡ h_{i,j}, i = 1, ..., j+1, except that Av_j cannot be formed and orthogonalized against v_1, ..., v_j until v_j is available. This does not alter the numerical values. Thus with upper triangular R_m,

(4.2) B_m ≡ A[x, V_{m−1}] = [b, AV_{m−1}] = V_m[e_1ρ, H_{m,m−1}] ≡ V_mR_m, V_m^T V_m = I.

So in theory the Arnoldi algorithm obtains the QR factorization of B_m ≡ [b, AV_{m−1}] by applying MGS to B_m. Computationally we can see that we have applied MGS to B̄_m ≡ [b, fl(AV̄_{m−1})], where V̄_{m−1} ≡ [v̄_1, ..., v̄_{m−1}] is the matrix of supposedly orthonormal vectors computed by MGS, and, see, for example, [3, section 3.5],

fl(Av̄_j) = (A + ΔA_j)v̄_j, |ΔA_j| ≤ γ_n|A|, so that fl(AV̄_{m−1}) = AṼ_{m−1} + ΔV_{m−1},

(4.3) |ΔV_{m−1}| ≤ γ_n|A|·|V̄_{m−1}|, ‖ΔV_{m−1}‖_F ≤ m^{1/2}γ̃_n‖A‖_2 ≤ m^{1/2}γ̃_n‖A‖_F,

gives the computed version of AV̄_{m−1}. We could replace n by the maximum number of nonzeros per row, while users of preconditioners, or less simple multiplications, could insert their own bounds on ΔV_{m−1} here.

Remark 4.1. The bounds in (4.3) are not column-scaling independent. Also any scaling applies to the columns of AV̄_{m−1}, not to A, and so would not be of such an advantage for MGS-GMRES as for ordinary MGS. Therefore it would seem important to ensure the columns of A are reasonably scaled for MGS-GMRES, e.g., to approach the minimum over positive diagonal D of ‖AD‖_F/σ_min(AD); see the appendix.

The rounding error behavior of the Arnoldi algorithm is as follows.

Theorem 4.2. For the computational version of the Arnoldi algorithm (4.1) (with m ≡ k+1) with floating point arithmetic unit roundoff ɛ producing V̄_m and R̄_m ≡ [e_1ρ̄, H̄_{m,m−1}], see (4.2), there exists an (n+m)-square orthogonal matrix P̃_m of the form (3.15), where Ṽ_m is V̄_m with its columns correctly normalized, such that if

(4.4) B̄_m ≡ [b, fl(AV̄_{m−1})] = [b, AṼ_{m−1}] + [0, ΔV_{m−1}],

where we can use the bounds on ΔV_{m−1} in (4.3), then all the results of Theorem 3.3 apply when B there is replaced by B̄_m here.

Thus whatever we say for MGS will hold for the Arnoldi algorithm if we simply replace B by B̄_m ≡ [b, fl(AV̄_{m−1})] = [b, AṼ_{m−1}] + [0, ΔV_{m−1}]. The key idea of viewing the Arnoldi algorithm as MGS applied to [b, AV_n] appeared in [5]. It was used in [8] and [ ], and in particular in [8], in which we outlined another possible approach to backward stability analysis of MGS-GMRES. Here we have chosen a different way of proving the backward stability result, and this follows the spirit of [5] and [10].

5. Loss of orthogonality of V̄_m from MGS and the Arnoldi algorithm. The analysis here is applicable to both the MGS and Arnoldi algorithms. B will denote the given matrix in MGS, or B̄_m ≡ [b, fl(AV̄_{m−1})] in the Arnoldi algorithm. Unlike [10, 14], we do not base the theory on [5, Lem. 3.], since a direct approach is cleaner and gives nicer results. It is important to be aware that our bounds will

11 74 C. C. PAIGE, M. ROZLOŽNÍK, AND Z. STRAKOŠ be of a different nature to those in [0, 4. Even though the rounding error analysis of MGS in [0, 4 is based on the ideas in [5, the bounds obtained in [0 and [4, pp are unexpectedly strong compared with our results based on [5. This is because [0, (8) (9) and [4, (.68) (.69) leading to [0, Thm. 3. and [4, Thm..4. follow from [6, p. 60, (45.3). But in Wilkinson [6, (45.3) follows from his (45.), (45.), and (44.6), where this last is clearly for fl arithmetic (double precision accumulation of inner products). Since double precision is used in [0, 4, their analysis is essentially assuming what could be called fl 4 quadruple precision accumulation of inner products. This is not stated in [0, 4, and the result is that their bounds appear to be much better (tighter) and the conditions much easier (less strict) than those that would have been obtained using standard floating point arithmetic. We will now obtain refined bounds based on our standard floating point arithmetic analysis and attempt to correct this misunderstanding. Remark 5.. The γ n in each expression in (3.) (3.4) is essentially the same γ n, that from Lemma 3., so we will call it ˆγ n. We could legitimately absorb various small constants into a series of new γ n, but that would be less transparent, so we will develop a sequence of loose bounds based on this fixed ˆγ n. To simplify our bounds, we use { } to mean under the assumption that mˆγ n κ F (B) /8. Note that this has the following consequences: (5.) mˆγ n κ F (B) /8 {( mˆγ n κ F (B)) 8/7 & μ m ˆγn κ F (B)8/7 /7 & ( + μ)/( μ) 4/3}. The basic bound is for S m = E m R m ; see (3.), (3.5). This is part of an orthogonal matrix so S m. From (3.) and (3.4) for any m m diagonal matrix D>0, S m F = E md( R m D) F E md F ( R m D) = E md F /σ min ( R m D) (5.) (5.3) E md F m ˆγ n (BD) :m F, σ min (BD) E m D σ min (BD) mˆγ n BD F S m F m ˆγn κ F (B)/( mˆγ n κ F (B)) { } 8 7 m ˆγn κ F (B) { } 7, with obvious restrictions. The bounds (5.3) took a minimum over D. V m [ v,..., v m isthen m matrix of vectors computed by m steps of MGS, Ṽ m [ṽ,...,ṽ m is the correctly normalized version of V m,soṽm satisfies (.) (.3). Since I S m is nonsingular upper triangular, the first m rows of P m in (3.5) give (5.4) (5.5) (I S m )Ṽ T m Ṽm(I S m ) T = I S m ST m =(I S m )(I S m ) T +(I S m ) S T m + S m (I S m ) T, Ṽm T Ṽm = I + S m(i T S m ) T +(I S m ) Sm, (I S m ) Sm = S m (I S m ) = strictly upper triangular part(ṽ m T Ṽm). Since Ṽ m ṽ T m is the above diagonal part of the last column of symmetric Ṽ m T Ṽm I, (5.5) and (5.3) give the key bound (at first using mˆγ n κ F (B)<; see (5.)), (5.6) Ṽm ṽ T m I Ṽ m T Ṽm F = (I S m ) Sm F S m F /( S m ) (m) ˆγn κ F (B)/[ (m+m )ˆγn κ F (B), { } 4 3 (m) ˆγn κ F (B) (cf. [3, 5, (5.3)),

12 ROUNDING ERROR ANALYSIS OF MGS-GMRES 75 and similarly for V m ; see (.). This is superior to the bound in [5, but the scaling idea is not new. Higham [3, p. 373 (and in the 996 first edition) argued that κ (B) in [5, 3, see (3.5), might be replaced by the minimum over positive diagonal matrices D of κ (BD), which is almost what we have proven using κ F (B) in (.). One measure of the extent of loss of orthogonality of Ṽm is κ (Ṽm). Lemma 5.. If Ṽ T m Ṽm = I + F m + F T m with strictly upper triangular F m and S m in F m S m (I S m ),see(5.4), then for all singular values σ i (Ṽm) S m + S m σ i (Ṽm) + S m S m, κ (Ṽm) + S m S m. Proof. Obviously F m S m /( S m ). For any y R k such that y =, Ṽmy = +y T Fm y + F m (+ S m )/( S m ), which gives the upper bound on every σ i (Ṽm). From (5.4) (I S m )Ṽ T m Ṽm(I S m ) T = I S m ST m, so for any y R k such that y =, define z (I S m ) T y so z + S m and then z T Ṽ T m Ṽmz z T z = yt Sm ST m y z T z S m ( + S m ) = S m + S m, giving the lower bound on every σ i (Ṽm). The bound on κ (Ṽm) follows. Combining Lemma 5. with (5.) and (5.3) gives the major result (5.7) for j =,...,m, jˆγ n κ F (B j ) /8 S j F /7 κ (Ṽj), σ min (Ṽj), σmax(ṽj) 4/3. At this level the distinction between κ ( V m ) and κ (Ṽm) is miniscule, see (.), and by setting j = m we can compare this with the elegant result which was the main theorem of Giraud and Langou [0; see [4, Thm..4.. Theorem 5. (see [0, Thm. 3.; 4, Thm..4.). Let B R n m be a matrix with full rank m n and condition number κ (B) such that (5.8).(m +)ɛ<0.0 and 8.53m 3 ɛκ (B) 0.. Then MGS in floating point arithmetic (present comment in 005: actually fl,or fl 4 if we use double precision) computes V m R n m as κ ( V m ).3. Note that the conditions (5.8) do not involve the dimension n of each column of V m, and this is the result of their analysis using fl. We can assume m satisfying the second condition in (5.8) will also satisfy the first. To compare Theorem 5. with j = m in (5.7), note that m γ n essentially means cmnɛ for some constant c >, probably less than the 8.53 in Theorem 5.. We assumed standard (IEEE) floating point arithmetic, but if we had assumed fl arithmetic, that would have eliminated the n from our condition in (5.7). We used (.), which involves BD F m BD. If we inserted this upper bound, that would mean our condition would be like that in Theorem 5., except we have the optimal result over column scaling; see (.). So if the same arithmetic is used, (5.7) is more revealing than Theorem 5.. It is worth noting that with the introduction of XBLAS [7, the fl and fl 4 options may become available in the near future.
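The way the loss of orthogonality scales with the (scaled) condition number, which (5.6)-(5.7) and Theorem 5.2 bound in slightly different ways, is easy to observe experimentally. A small illustration (ours, not from the paper; the synthetic test matrices are assumptions):

```python
import numpy as np

def mgs_loss(B):
    # run MGS and report the loss of orthogonality || I - V^T V ||_F
    V = np.array(B, dtype=float); m = V.shape[1]
    for i in range(m):
        V[:, i] /= np.linalg.norm(V[:, i])
        for j in range(i + 1, m):
            V[:, j] -= V[:, i] * (V[:, i] @ V[:, j])
    return np.linalg.norm(np.eye(m) - V.T @ V)

rng = np.random.default_rng(6)
n, m = 60, 20
U, _ = np.linalg.qr(rng.standard_normal((n, m)))
W, _ = np.linalg.qr(rng.standard_normal((m, m)))
for kappa in [1e2, 1e6, 1e10, 1e14]:
    S = np.diag(np.logspace(0, -np.log10(kappa), m))   # singular values from 1 down to 1/kappa
    B = U @ S @ W.T
    print(f"kappa_2(B) = {kappa:.0e}   loss of orthogonality = {mgs_loss(B):.2e}")
```

The loss of orthogonality grows roughly in proportion to the condition number, as the bounds above predict; once the condition number approaches 1/ɛ, orthogonality is completely lost, which is the situation the value m̂ of the next section is designed to capture.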

6. A critical step in the Arnoldi and MGS-GMRES iterations. It will simplify the analysis if we use (5.7) to define a distinct value m̂ of m. This value will depend on the problem and the constants we have chosen, but it will be sufficient for us to prove convergence and backward stability of MGS-GMRES in m̂ ≤ n+1 steps. For the ordinary MGS algorithm remember B̄_m = B_m, and think of m as increasing.

(6.1) Let m̂ be the first integer such that κ_2(Ṽ_m̂) > 4/3;

then we know from (5.7) that for B̄_m̂ in the Arnoldi algorithm, see (4.4) and (2.1),

(6.2) m̂γ̂_nκ_F(B̄_m̂) > 1/8, so σ_min(B̄_m̂D) < 8m̂γ̂_n‖B̄_m̂D‖_F for every diagonal D > 0.

But since σ_min(Ṽ_j) ≤ σ(ṽ_1) = ‖ṽ_1‖_2 = 1 ≤ σ_max(Ṽ_j), (6.1) also tells us that

(6.3) κ_2(Ṽ_j), σ_min(Ṽ_j)^{−1}, σ_max(Ṽ_j) ≤ 4/3, j = 1, ..., m̂−1.

The above reveals the philosophy of the present approach to proving backward stability of MGS-GMRES. Other approaches have been tried. Here all is based on κ_F(B̄_m) rather than the backward error or residual norm. In [ , Thm. 3., p. 73] a different approach was taken: the assumption was directly related to the norm of the residual. The present approach leads to very compact and elegant formulations, and it is hard to say now whether the earlier approaches (see [8]) would have succeeded.

7. Least squares solutions via MGS. The linear least squares problem

(7.1) ŷ ≡ arg min_y ‖b − Cy‖_2, r̂ ≡ b − Cŷ, C ∈ R^{n×(m−1)},

may be solved via MGS in different ways. Here we discuss two of these ways, but first we remind the reader how this problem appears in MGS-GMRES with C ≡ AV_{m−1}. After carrying out step m−1 of the Arnoldi algorithm as in section 4 to produce [b, AV_{m−1}] = V_mR_m, see (4.2), the MGS-GMRES algorithm in theory minimizes the 2-norm of the residual r_{m−1} ≡ b − Ax_{m−1} over x_{m−1} ∈ x_0 + K_{m−1}(A, r_0), where for simplicity we are assuming x_0 = 0 here. It does this by using V_{m−1} from (4.2) to provide an approximation x_{m−1} ≡ V_{m−1}y_{m−1} to the solution x of (1.1). Then the corresponding residual is

(7.2) r_{m−1} ≡ b − Ax_{m−1} = [b, AV_{m−1}] [ 1 ; −y_{m−1} ] = V_mR_m [ 1 ; −y_{m−1} ],

where R_m ≡ [e_1ρ, H_{m,m−1}]. The ideal least squares problem is

(7.3) y_{m−1} = arg min_y ‖ [b, AV_{m−1}] [ 1 ; −y ] ‖_2,

but (in theory) the MGS-GMRES least squares solution is found by solving

(7.4) y_{m−1} ≡ arg min_y ‖ R_m [ 1 ; −y ] ‖_2.

7.1. The MGS least squares solution used in MGS-GMRES. If B = [C, b] in (3.1)-(3.4), and C has rank m−1, then it was shown in [5, (6.3)], see also [3, section 20.3], that MGS can be used to compute ŷ in (7.1) in a backward stable way. Here we need to show that we can solve (7.1) in a stable way with MGS applied,

14 ROUNDING ERROR ANALYSIS OF MGS-GMRES 77 to B =[b, C (note the reversal of C and b) in order to prove the backward stability of MGS-GMRES. Just remember B =[b, C B m in (4.4) for MGS-GMRES. The analysis could be based directly on [5, Lem. 3., but the following is more precise. Let MGS on B in (3.) lead to the computed R m (we can assume R m is nonsingular; see later) satisfying (3.), where B =[b, C. Then (3.) and (7.) give (7.5) (7.6) P m [ Rm 0 = [ 0 [b, C ŷ arg min B y + E m ; E m e j j γ n [b, Ce j, j =,...,m, y, ˆr = B. ŷ To solve the latter computationally, having applied MGS to B to give R m,we carry out a backward stable solution of min R (7.7) m y y by orthogonal reduction followed by the solution of a triangular system. With (3.3) we will see this leads to [ t (7.8) ˆQ T ( R m +ΔR m )= Ū +ΔU τ 0, (Ū +ΔU)ȳ = t, ΔR m e j γ m Re j γ m Be j = γ m [b, Ce j, j =,...,m, where ˆQ is an orthogonal matrix while τ, t, nonsingular upper triangular Ū, and ȳ are computed quantities. Here ΔU is the backward rounding error in the solution of the upper triangular system to give ȳ, see, for example, [3, Thm. 8.3, and ΔR m was obtained by combining ΔU with the backward rounding error in the QR factorization that produced τ, t and Ū; see, for example, [3, Thm. 9.0 (where here there are m stages, each of one rotation). Clearly ȳ satisfies [ ȳ = arg min y ( R (7.9) m +ΔR m ). y In order to relate this least squares solution back to the MGS factorization of B, we add the error term ΔR m to (7.5) to give (replacing j γ n + γ m by j γ n ) ( Rm +ΔR P m ) 0 m = + 0 [b, C Êm, Ê m E m + P ΔRm (7.0) m, 0 Ême j j γ n [b, Ce j, j =,...,m. Now we can write for any y R m [ r =r(y) b Cy, p=p(y) P ( Rm +ΔR m ) (7.) m 0 y [ 0 = +Êm, r y and we see from (.6) in Lemma.3 that for any y R m there exists N(y) so that p(y) = ( R m +ΔR m ) = b Cy + N(y)Êm, N(y) y y. Defining [Δb(y), ΔC(y) N(y)Êm shows that for all y R m [ ( R (7.) m +ΔR m ) = b +Δb(y) [C +ΔC(y)y y.

15 78 C. C. PAIGE, M. ROZLOŽNÍK, AND Z. STRAKOŠ Thus ȳ in (7.9) also satisfies (7.3) ȳ = arg min b+δb(y) [C +ΔC(y)y, y [Δb(y), ΔC(y)e j j γ n [b, Ce j, j =,...,m, where the bounds are independent of y, so that ȳ is a backward stable solution for (7.). That is, MGS applied to B =[b, C followed by (7.7) is backward stable as long as the computed R m from MGS is nonsingular (we can stop early to ensure this). The almost identical analysis and result applies wherever b is in B, but we just gave the B =[b, C case for clarity. Since we have a backward stable solution ȳ, we expect various related quantities to have reliable values, and we now quickly show two cases of this. If E F γ B F, then Ey = i et i Ey i et i E y = E F y γ B F y. So from the bounds in (7.0) we have for any y R m the useful basic bound [ (7.4) y Êm γ mn ψ m (y), ψ m (y) b + C F y. Multiplying (7.8) and (7.0) on the right by (7.5) r b Cȳ, ˆQem τ Pm = 0 ȳ shows that the residual r satisfies 0 + r Êm, r ȳ τ γ mn ψ m (ȳ), so that τ approximates r with a good relative error bound. Multiplying the last equality in this on the left by [Ṽm,I n, and using (3.5), (3.), (7.0), (7.8), (3.4), and (.3) with the argument leading to (7.4), we see that (7.6) Ṽ m ˆQem τ = r +[Ṽm,I n Êm [ ȳ = r +[Ṽm(E m +ΔR m )+E m r Ṽm ˆQe m τ γ mn ψ m (ȳ) for m< ˆm in (6.)., ȳ Thus V m ˆQem τ also approximates r b Cȳ with a good relative error bound; see (.) and its following sentence. 7.. Least squares solutions and loss of orthogonality in MGS. An apparently strong relationship was noticed between convergence of finite precision MGS- GMRES and loss of orthogonality among the Arnoldi vectors; see [, 9. It was thought that if this relationship was fully understood, we might use it to prove that finite precision MGS-GMRES would necessarily converge; see, for example, [8. A similar relationship certainly does exist it is the relationship between the loss of orthogonality in ordinary MGS applied to B, and the residual norms for what we will call the last vector least squares (LVLS) problems involving B, and we will derive this here. It adds to our understanding, but it is not necessary for our other proofs and could initially be skipped. Because this is a theoretical tool, we will only consider rounding errors in the MGS part of the computation. We will do the analysis for MGS applied to any matrix B =[b,...,b m. After step j we have n j V j and j j R j, so that t R j j [Ūj, Ū τ j ȳ j = t j, ȳ j = arg min j y R y j, τ j = R ȳj (7.7) j.

16 ROUNDING ERROR ANALYSIS OF MGS-GMRES 79 In theory ȳ j minimizes b j B j y, but we would like to know that loss of orthogonality caused by rounding errors in MGS does not prevent this. One indicator of loss of orthogonality is Ṽ j ṽj. T From (7.7) we see that [Ū j Ū j t j τ j (7.8) R j = τ j = [ Ū j 0 [ ȳj τ j, R j e j τ j = ȳj, so that with (5.5) we have with r j b j B j ȳ j (see (7.4) and (7.5) but now using E j and its bound in (3.4) rather than Êj and its bound in (7.0)) [Ṽ (I S T j ) j ṽ j = 0 S j e j =E j R j e j =E j ȳj (7.9) τ j, r j τ j j γn ψ m (ȳ j ). Now define a normwise relative backward error (in the terminology of [3, Thm. 7.) β F (b, A, y) β A,b F (b, A, y), where βg,f F (b, A, y) b Ay (7.0). f + G F y Remark 7.. The theory in [3, Thm. 7. assumes a vector norm with its subordinate matrix norm, but with the Frobenius norm in the denominator Rigal and Gaches theory still works, so this is a possibly new, useful (and usually smaller) construct that is easier to compute than the usual one. A proof similar to that in [3, Thm. 7. shows that β G,f F (b, A, y) = min {η :(A + δa)y = b + δb, δa F η G F, δb η f }. δa,δb Using (7.0) with the bounds in (3.4), (5.6), (7.9), and the definition in (7.4) (see also (5.3)) shows that τ j. Ṽ j ṽ T j = (I S j ) E j ȳj j γn ψ m (ȳ j )/( S j ), β F (b j,b j, ȳ j ) Ṽ j ṽ T j j γ n (7.). S j Remark 7.. The product of the loss of orthogonality Ṽ j ṽj T at step j and the normwise relative backward error β F (b j,b j, ȳ j ) of the LVLS problem is bounded by O(ɛ) until S j, that is, until orthogonality of the ṽ,...,ṽ j is totally lost; see (5.5) and Lemma 5.. This is another nice result, as it again reveals how MGS applied to B m loses orthogonality at each step see the related section 5. These bounds on the individual Ṽ j ṽj T complement the bounds in (5.6), since they are essentially in terms of the individual normwise relative backward errors β F (b j,b j, ȳ j ), rather than κ F (B j ). However it is important to note that the LVLS problem considered in this section (see the line after (7.7)) is not the least squares problem solved for MGS-GMRES, which has the form of (7.6) instead. The two can give very different results in the general case, but in the problems we have solved via MGS-GMRES, these normwise relative backward errors seem to be of similar magnitudes for both problems, and this led to the conjecture in the first place. The similarity in behavior of the two problems is apparently related to the fact that B m in MGS-GMRES is a Krylov basis. In this case it appears that the normwise relative backward errors of both least squares problems will converge (numerically) as the columns of B j approach numerical linear dependence; see [7, 8. Thus we have neither proven nor disproven the conjecture, but we have added weight to it.
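The normwise relative backward error of section 7.2 is inexpensive to evaluate. Here is a sketch of the Frobenius-norm variant β^F of (7.20) (our own code; the formula is the Rigal-Gaches expression with ‖A‖_F rather than ‖A‖_2 in the denominator, as discussed in Remark 7.1):

```python
import numpy as np

def beta_F(b, A, y):
    """Normwise relative backward error ||b - A y||_2 / (||b||_2 + ||A||_F ||y||_2)."""
    return np.linalg.norm(b - A @ y) / (np.linalg.norm(b)
                                        + np.linalg.norm(A, 'fro') * np.linalg.norm(y))

rng = np.random.default_rng(7)
A = rng.standard_normal((40, 10))
b = rng.standard_normal(40)
y = np.linalg.lstsq(A, b, rcond=None)[0]      # a least squares solution
print(beta_F(b, A, y))                        # small only when the residual b - A y is small
```

In the experiments of section 7.2 this quantity, evaluated for the last vector least squares problems, is paired with the loss of orthogonality ‖Ṽ_{j−1}^T ṽ_j‖_2; their product stays at the O(ɛ) level until orthogonality is totally lost.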

17 80 C. C. PAIGE, M. ROZLOŽNÍK, AND Z. STRAKOŠ 8. Numerical behavior of the MGS-GMRES algorithm. We now only consider MGS-GMRES and use k instead of m to avoid many indices of the form m. In section 4 we saw that k steps of the Arnoldi algorithm is in theory just k+ steps of the MGS algorithm applied to B k+ [b, AV k togive[b, AV k =V k+ R k+ = V k+ [e ρ, H k+,k. And in practice the only difference in the rounding error analysis is that we apply ordinary MGS to B k+ [b, fl(a V k )=[b, AṼk+[0, ΔV k ; see (4.3). In section 8. we combine this fact with the results of section 7. to prove backward stability of the MGS-GMRES least squares solution ȳ k at every step. In theory MGS-GMRES must solve Ax = b for nonsingular n nain n steps since we cannot have more than n orthonormal vectors in R n. But in practice the vectors in MGS-GMRES lose orthogonality, so we need another way to prove that we reach a solution to (.). In section 8. we will show that the MGS-GMRES algorithm for any problem satisfying (.) must, for some k, produce V k+ so that numerically b lies in the range of A V k, and that MGS-GMRES must give a backward stable solution to (.). This k is ˆm, which is n; see (6.). 8.. Backward stability of the MGS-GMRES least squares solutions. The equivalent of the MGS result (7.3) for MGS-GMRES is obtained by replacing [b, C by B k+ [b, AṼk +ΔV k throughout (7.3); see Theorem 4.. Thus the computed ȳ k at step k in MGS-GMRES satisfies (with (4.3) and section 6) (8.) ȳ k = arg min r k (y), r k (y) b+δb k (y) [AṼk+ΔV k +ΔC k (y)y y [Δb k (y), ΔC k (y)e j γ kn B k+ e j, j =,...,k+; ΔV k F k γn A F, Δb k (y) γ kn b, ΔV k +ΔC k (y) F γ kn [ A F + AṼk F γ kn A F if k< ˆm. This has proven the MGS-GMRES least squares solution ȳ k is backward stable for min b AṼky k< ˆm, y which is all we need for this least squares problem. But even if k ˆm, it is straightforward to show that it still gives a backward stable least squares solution. 8.. Backward stability of MGS-GMRES for Ax = b in (.). Even though MGS-GMRES always computes a backward stable solution ȳ k for the least squares problem (7.3), see section 8., we still have to prove that V k ȳ k will be a backward stable solution for the original system (.) for some k (we take this k to be ˆm in (6.)), and this is exceptionally difficult. Usually we want to show we have a backward stable solution when we know we have a small residual. The analysis here is different in that we will first prove that B ˆm is numerically rank deficient, see (8.4), but to prove backward stability, we will then have to prove that our residual will be small, amongst other things, and this is far from obvious. Fortunately two little known researchers have studied this arcane area, and we will take ideas from [7; see Theorem.4. To simplify the development and expressions we will absorb all small constants into the γ kn terms below. In (8.) set k ˆm n from (6.) and write (8.) r k (ȳ k )=b k A k ȳ k, b k b+δb k (ȳ k ), A k AṼk+ΔṼk(ȳ k ), Δb k (ȳ k ) γ kn b, ΔṼk(y) ΔV k +ΔC k (y), ΔṼk(y) F γ kn A F. We need to take advantage of the scaling invariance of MGS in order to obtain our results. Here we need only scale b, so write D diag(φ, I k ) for any scalar φ>0. Since
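To see the behavior established in this section, the following sketch (ours) builds the same Arnoldi/MGS recurrence as in section 4 and solves the small least squares problem (7.4) densely, instead of with the usual Givens rotation recurrence, while tracking the normwise relative backward error of x̄_k = V̄_kȳ_k for Ax = b. The test matrix and all parameters are illustrative assumptions only.

```python
import numpy as np

def mgs_gmres_backward_errors(A, b, kmax):
    """Plain MGS-GMRES with x0 = 0; returns the backward error of x_k at each step."""
    n = b.shape[0]
    rho = np.linalg.norm(b)
    V = np.zeros((n, kmax + 1)); V[:, 0] = b / rho
    H = np.zeros((kmax + 1, kmax))
    errs = []
    for k in range(kmax):
        w = A @ V[:, k]
        for i in range(k + 1):                       # MGS orthogonalization
            H[i, k] = V[:, i] @ w
            w -= H[i, k] * V[:, i]
        H[k + 1, k] = np.linalg.norm(w)
        if H[k + 1, k] > 0:
            V[:, k + 1] = w / H[k + 1, k]
        # small least squares problem, cf. (7.4): min_y || e_1 rho - H_{k+1,k} y ||
        y = np.linalg.lstsq(H[:k + 2, :k + 1], rho * np.eye(k + 2)[:, 0], rcond=None)[0]
        x = V[:, :k + 1] @ y
        errs.append(np.linalg.norm(b - A @ x) /
                    (np.linalg.norm(b) + np.linalg.norm(A, 'fro') * np.linalg.norm(x)))
        if H[k + 1, k] == 0:
            break
    return errs

rng = np.random.default_rng(8)
n = 100
A = np.eye(n) + 0.5 * rng.standard_normal((n, n)) / np.sqrt(n)   # well-conditioned test matrix
b = rng.standard_normal(n)
for k, e in enumerate(mgs_gmres_backward_errors(A, b, 60), start=1):
    if k % 10 == 0:
        print(f"step {k:3d}   backward error {e:.2e}")
```

For such a well-conditioned A the backward error falls to the O(ɛ) level well before step n, which is the behavior sections 8.1-8.2 establish for any system satisfying (1.1).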

18 ROUNDING ERROR ANALYSIS OF MGS-GMRES 8 B k+ [b, fl(a V k )=[b, AṼk +ΔV k, from (8.) with the bounds in (8.) we have (8.3) [b k φ, A k = B k+ D +ΔB k D, ΔB k [Δb k (ȳ k ), ΔC k (ȳ k ), ΔB k D F γ kn B k+ D F γ kn [b k φ, A k F, B k+ D F ( γ kn ) [b k φ, A k F, b k ( + γ kn ) b. In addition, k+ is the first integer such that κ (Ṽk+) > 4/3, so section 6 gives (8.4) σ min ( B k+ D) < 8(k+)ˆγ n B k+ D F γ kn [b k φ, A k F φ>0; κ (Ṽk), σ min (Ṽk), σ max (Ṽk) 4/3; and similarly A k F AṼk F + γ kn A F (4/3+ γ kn ) A F. We can combine (8.), (8.3), and (8.4) to give under the condition in (.) (8.5) σ min (A k ) σ min (AṼk) ΔṼk(ȳ k ) 3σ min (A)/4 γ kn A F > 0, σ min ([b k φ, A k ) σ min ( B k+ D)+ ΔB k D γ kn [b k φ, A k F. The above allows us to define and analyze an important scalar, see Theorem.4, (8.6) δ k (φ) σ min([b k φ, A k ) σ min (A k ), where from (8.5) A k has full column rank. Now ȳ k and r k (ȳ k ) solve the linear least squares problem A k y b k in (8.); see (8.). If [b k,a k does not have full column rank, then r k (ȳ k )=0,so x k Ṽkȳ k is a backward stable solution for (.), which we wanted to show. Next suppose [b k,a k has full column rank. We will not seek to minimize with respect to φ the upper bound on ˆr in Theorem.4, which would be unnecessarily complicated, but instead prove that there exists a value ˆφ of φ satisfying (8.7) below, and use this value: (8.7) ˆφ >0, σ min(a k ) σ min([b k ˆφ, Ak ) = σ min(a k ) ȳ k ˆφ. Writing LHS σmin (A k) σmin ([b kφ, A k ), RHS σmin (A k) ȳ k φ we want to find φ so that LHS=RHS. But φ=0 LHS > RHS, while φ= ȳ k LHS < RHS, so from continuity ˆφ (0, ȳ k ) satisfying (8.7). With (8.6) this shows that (8.8) δ k ( ˆφ) <, ˆφ = ȳ k /[ δ k ( ˆφ), 0 < ˆφ < ȳ k. It then follows from Theorem.4 that with (8.5), (8.8), and (8.4), (8.9) r k (ȳ k ) σmin([b k ˆφ, Ak )( ˆφ + ȳ k /[ δ k ( ˆφ) ) γ kn( b k ˆφ + A k F ) ˆφ. But from (8.) and (8.) since r k (ȳ k )=b k A k ȳ k, A T k r k(ȳ k ) = 0, and from (8.8), b k ˆφ = r k (ȳ k ) ˆφ + A k ȳ k ˆφ, γ kn( b k ˆφ + A k F )+ A k ( δ k ( ˆφ) ) γ kn b k ˆφ +(+ γ kn) A k F, (8.0) b k ˆφ + γ kn γ kn A k F.


More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Lecture 3: QR-Factorization

Lecture 3: QR-Factorization Lecture 3: QR-Factorization This lecture introduces the Gram Schmidt orthonormalization process and the associated QR-factorization of matrices It also outlines some applications of this factorization

More information

Least-Squares Systems and The QR factorization

Least-Squares Systems and The QR factorization Least-Squares Systems and The QR factorization Orthogonality Least-squares systems. The Gram-Schmidt and Modified Gram-Schmidt processes. The Householder QR and the Givens QR. Orthogonality The Gram-Schmidt

More information

Introduction to Numerical Linear Algebra II

Introduction to Numerical Linear Algebra II Introduction to Numerical Linear Algebra II Petros Drineas These slides were prepared by Ilse Ipsen for the 2015 Gene Golub SIAM Summer School on RandNLA 1 / 49 Overview We will cover this material in

More information

On the influence of eigenvalues on Bi-CG residual norms

On the influence of eigenvalues on Bi-CG residual norms On the influence of eigenvalues on Bi-CG residual norms Jurjen Duintjer Tebbens Institute of Computer Science Academy of Sciences of the Czech Republic duintjertebbens@cs.cas.cz Gérard Meurant 30, rue

More information

Applied Mathematics 205. Unit II: Numerical Linear Algebra. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit II: Numerical Linear Algebra. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit II: Numerical Linear Algebra Lecturer: Dr. David Knezevic Unit II: Numerical Linear Algebra Chapter II.3: QR Factorization, SVD 2 / 66 QR Factorization 3 / 66 QR Factorization

More information

Last Time. Social Network Graphs Betweenness. Graph Laplacian. Girvan-Newman Algorithm. Spectral Bisection

Last Time. Social Network Graphs Betweenness. Graph Laplacian. Girvan-Newman Algorithm. Spectral Bisection Eigenvalue Problems Last Time Social Network Graphs Betweenness Girvan-Newman Algorithm Graph Laplacian Spectral Bisection λ 2, w 2 Today Small deviation into eigenvalue problems Formulation Standard eigenvalue

More information

Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method

Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method Leslie Foster 11-5-2012 We will discuss the FOM (full orthogonalization method), CG,

More information

On the Perturbation of the Q-factor of the QR Factorization

On the Perturbation of the Q-factor of the QR Factorization NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. ; :1 6 [Version: /9/18 v1.] On the Perturbation of the Q-factor of the QR Factorization X.-W. Chang McGill University, School of Comptuer

More information

Error Bounds for Iterative Refinement in Three Precisions

Error Bounds for Iterative Refinement in Three Precisions Error Bounds for Iterative Refinement in Three Precisions Erin C. Carson, New York University Nicholas J. Higham, University of Manchester SIAM Annual Meeting Portland, Oregon July 13, 018 Hardware Support

More information

EE731 Lecture Notes: Matrix Computations for Signal Processing

EE731 Lecture Notes: Matrix Computations for Signal Processing EE731 Lecture Notes: Matrix Computations for Signal Processing James P. Reilly c Department of Electrical and Computer Engineering McMaster University September 22, 2005 0 Preface This collection of ten

More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning

AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 23: GMRES and Other Krylov Subspace Methods; Preconditioning Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 18 Outline

More information

Charles University Faculty of Mathematics and Physics DOCTORAL THESIS. Krylov subspace approximations in linear algebraic problems

Charles University Faculty of Mathematics and Physics DOCTORAL THESIS. Krylov subspace approximations in linear algebraic problems Charles University Faculty of Mathematics and Physics DOCTORAL THESIS Iveta Hnětynková Krylov subspace approximations in linear algebraic problems Department of Numerical Mathematics Supervisor: Doc. RNDr.

More information

Gaussian Elimination for Linear Systems

Gaussian Elimination for Linear Systems Gaussian Elimination for Linear Systems Tsung-Ming Huang Department of Mathematics National Taiwan Normal University October 3, 2011 1/56 Outline 1 Elementary matrices 2 LR-factorization 3 Gaussian elimination

More information

Analysis of Block LDL T Factorizations for Symmetric Indefinite Matrices

Analysis of Block LDL T Factorizations for Symmetric Indefinite Matrices Analysis of Block LDL T Factorizations for Symmetric Indefinite Matrices Haw-ren Fang August 24, 2007 Abstract We consider the block LDL T factorizations for symmetric indefinite matrices in the form LBL

More information

ETNA Kent State University

ETNA Kent State University Electronic Transactions on Numerical Analysis. Volume 1, pp. 1-11, 8. Copyright 8,. ISSN 168-961. MAJORIZATION BOUNDS FOR RITZ VALUES OF HERMITIAN MATRICES CHRISTOPHER C. PAIGE AND IVO PANAYOTOV Abstract.

More information

Linear Algebra March 16, 2019

Linear Algebra March 16, 2019 Linear Algebra March 16, 2019 2 Contents 0.1 Notation................................ 4 1 Systems of linear equations, and matrices 5 1.1 Systems of linear equations..................... 5 1.2 Augmented

More information

A Backward Stable Hyperbolic QR Factorization Method for Solving Indefinite Least Squares Problem

A Backward Stable Hyperbolic QR Factorization Method for Solving Indefinite Least Squares Problem A Backward Stable Hyperbolic QR Factorization Method for Solving Indefinite Least Suares Problem Hongguo Xu Dedicated to Professor Erxiong Jiang on the occasion of his 7th birthday. Abstract We present

More information

Page 52. Lecture 3: Inner Product Spaces Dual Spaces, Dirac Notation, and Adjoints Date Revised: 2008/10/03 Date Given: 2008/10/03

Page 52. Lecture 3: Inner Product Spaces Dual Spaces, Dirac Notation, and Adjoints Date Revised: 2008/10/03 Date Given: 2008/10/03 Page 5 Lecture : Inner Product Spaces Dual Spaces, Dirac Notation, and Adjoints Date Revised: 008/10/0 Date Given: 008/10/0 Inner Product Spaces: Definitions Section. Mathematical Preliminaries: Inner

More information

Iterative Methods for Sparse Linear Systems

Iterative Methods for Sparse Linear Systems Iterative Methods for Sparse Linear Systems Luca Bergamaschi e-mail: berga@dmsa.unipd.it - http://www.dmsa.unipd.it/ berga Department of Mathematical Methods and Models for Scientific Applications University

More information

arxiv: v1 [math.na] 5 May 2011

arxiv: v1 [math.na] 5 May 2011 ITERATIVE METHODS FOR COMPUTING EIGENVALUES AND EIGENVECTORS MAYSUM PANJU arxiv:1105.1185v1 [math.na] 5 May 2011 Abstract. We examine some numerical iterative methods for computing the eigenvalues and

More information

The amount of work to construct each new guess from the previous one should be a small multiple of the number of nonzeros in A.

The amount of work to construct each new guess from the previous one should be a small multiple of the number of nonzeros in A. AMSC/CMSC 661 Scientific Computing II Spring 2005 Solution of Sparse Linear Systems Part 2: Iterative methods Dianne P. O Leary c 2005 Solving Sparse Linear Systems: Iterative methods The plan: Iterative

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

Krylov Subspace Methods that Are Based on the Minimization of the Residual

Krylov Subspace Methods that Are Based on the Minimization of the Residual Chapter 5 Krylov Subspace Methods that Are Based on the Minimization of the Residual Remark 51 Goal he goal of these methods consists in determining x k x 0 +K k r 0,A such that the corresponding Euclidean

More information

Conjugate gradient method. Descent method. Conjugate search direction. Conjugate Gradient Algorithm (294)

Conjugate gradient method. Descent method. Conjugate search direction. Conjugate Gradient Algorithm (294) Conjugate gradient method Descent method Hestenes, Stiefel 1952 For A N N SPD In exact arithmetic, solves in N steps In real arithmetic No guaranteed stopping Often converges in many fewer than N steps

More information

Numerical Methods in Matrix Computations

Numerical Methods in Matrix Computations Ake Bjorck Numerical Methods in Matrix Computations Springer Contents 1 Direct Methods for Linear Systems 1 1.1 Elements of Matrix Theory 1 1.1.1 Matrix Algebra 2 1.1.2 Vector Spaces 6 1.1.3 Submatrices

More information

For δa E, this motivates the definition of the Bauer-Skeel condition number ([2], [3], [14], [15])

For δa E, this motivates the definition of the Bauer-Skeel condition number ([2], [3], [14], [15]) LAA 278, pp.2-32, 998 STRUCTURED PERTURBATIONS AND SYMMETRIC MATRICES SIEGFRIED M. RUMP Abstract. For a given n by n matrix the ratio between the componentwise distance to the nearest singular matrix and

More information

Jim Lambers MAT 610 Summer Session Lecture 2 Notes

Jim Lambers MAT 610 Summer Session Lecture 2 Notes Jim Lambers MAT 610 Summer Session 2009-10 Lecture 2 Notes These notes correspond to Sections 2.2-2.4 in the text. Vector Norms Given vectors x and y of length one, which are simply scalars x and y, the

More information

Numerical Methods I Solving Square Linear Systems: GEM and LU factorization

Numerical Methods I Solving Square Linear Systems: GEM and LU factorization Numerical Methods I Solving Square Linear Systems: GEM and LU factorization Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 September 18th,

More information

Fundamentals of Engineering Analysis (650163)

Fundamentals of Engineering Analysis (650163) Philadelphia University Faculty of Engineering Communications and Electronics Engineering Fundamentals of Engineering Analysis (6563) Part Dr. Omar R Daoud Matrices: Introduction DEFINITION A matrix is

More information

Principles and Analysis of Krylov Subspace Methods

Principles and Analysis of Krylov Subspace Methods Principles and Analysis of Krylov Subspace Methods Zdeněk Strakoš Institute of Computer Science, Academy of Sciences, Prague www.cs.cas.cz/~strakos Ostrava, February 2005 1 With special thanks to C.C.

More information

BlockMatrixComputations and the Singular Value Decomposition. ATaleofTwoIdeas

BlockMatrixComputations and the Singular Value Decomposition. ATaleofTwoIdeas BlockMatrixComputations and the Singular Value Decomposition ATaleofTwoIdeas Charles F. Van Loan Department of Computer Science Cornell University Supported in part by the NSF contract CCR-9901988. Block

More information

SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA

SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS. Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA 1 SOLVING SPARSE LINEAR SYSTEMS OF EQUATIONS Chao Yang Computational Research Division Lawrence Berkeley National Laboratory Berkeley, CA, USA 2 OUTLINE Sparse matrix storage format Basic factorization

More information

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic

Applied Mathematics 205. Unit V: Eigenvalue Problems. Lecturer: Dr. David Knezevic Applied Mathematics 205 Unit V: Eigenvalue Problems Lecturer: Dr. David Knezevic Unit V: Eigenvalue Problems Chapter V.4: Krylov Subspace Methods 2 / 51 Krylov Subspace Methods In this chapter we give

More information

We first repeat some well known facts about condition numbers for normwise and componentwise perturbations. Consider the matrix

We first repeat some well known facts about condition numbers for normwise and componentwise perturbations. Consider the matrix BIT 39(1), pp. 143 151, 1999 ILL-CONDITIONEDNESS NEEDS NOT BE COMPONENTWISE NEAR TO ILL-POSEDNESS FOR LEAST SQUARES PROBLEMS SIEGFRIED M. RUMP Abstract. The condition number of a problem measures the sensitivity

More information

Solving large scale eigenvalue problems

Solving large scale eigenvalue problems arge scale eigenvalue problems, Lecture 4, March 14, 2018 1/41 Lecture 4, March 14, 2018: The QR algorithm http://people.inf.ethz.ch/arbenz/ewp/ Peter Arbenz Computer Science Department, ETH Zürich E-mail:

More information

Multiplicative Perturbation Analysis for QR Factorizations

Multiplicative Perturbation Analysis for QR Factorizations Multiplicative Perturbation Analysis for QR Factorizations Xiao-Wen Chang Ren-Cang Li Technical Report 011-01 http://www.uta.edu/math/preprint/ Multiplicative Perturbation Analysis for QR Factorizations

More information

Lecture 6. Numerical methods. Approximation of functions

Lecture 6. Numerical methods. Approximation of functions Lecture 6 Numerical methods Approximation of functions Lecture 6 OUTLINE 1. Approximation and interpolation 2. Least-square method basis functions design matrix residual weighted least squares normal equation

More information

c 2008 Society for Industrial and Applied Mathematics

c 2008 Society for Industrial and Applied Mathematics SIAM J. MATRIX ANAL. APPL. Vol. 30, No. 4, pp. 1483 1499 c 2008 Society for Industrial and Applied Mathematics HOW TO MAKE SIMPLER GMRES AND GCR MORE STABLE PAVEL JIRÁNEK, MIROSLAV ROZLOŽNÍK, AND MARTIN

More information

Topics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems

Topics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems Topics The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems What about non-spd systems? Methods requiring small history Methods requiring large history Summary of solvers 1 / 52 Conjugate

More information

Numerical Analysis Lecture Notes

Numerical Analysis Lecture Notes Numerical Analysis Lecture Notes Peter J Olver 8 Numerical Computation of Eigenvalues In this part, we discuss some practical methods for computing eigenvalues and eigenvectors of matrices Needless to

More information

This can be accomplished by left matrix multiplication as follows: I

This can be accomplished by left matrix multiplication as follows: I 1 Numerical Linear Algebra 11 The LU Factorization Recall from linear algebra that Gaussian elimination is a method for solving linear systems of the form Ax = b, where A R m n and bran(a) In this method

More information

Linear Algebra. The analysis of many models in the social sciences reduces to the study of systems of equations.

Linear Algebra. The analysis of many models in the social sciences reduces to the study of systems of equations. POLI 7 - Mathematical and Statistical Foundations Prof S Saiegh Fall Lecture Notes - Class 4 October 4, Linear Algebra The analysis of many models in the social sciences reduces to the study of systems

More information

GMRES ON (NEARLY) SINGULAR SYSTEMS

GMRES ON (NEARLY) SINGULAR SYSTEMS SIAM J. MATRIX ANAL. APPL. c 1997 Society for Industrial and Applied Mathematics Vol. 18, No. 1, pp. 37 51, January 1997 004 GMRES ON (NEARLY) SINGULAR SYSTEMS PETER N. BROWN AND HOMER F. WALKER Abstract.

More information

EIGENVALUE PROBLEMS. EIGENVALUE PROBLEMS p. 1/4

EIGENVALUE PROBLEMS. EIGENVALUE PROBLEMS p. 1/4 EIGENVALUE PROBLEMS EIGENVALUE PROBLEMS p. 1/4 EIGENVALUE PROBLEMS p. 2/4 Eigenvalues and eigenvectors Let A C n n. Suppose Ax = λx, x 0, then x is a (right) eigenvector of A, corresponding to the eigenvalue

More information

A Residual Inverse Power Method

A Residual Inverse Power Method University of Maryland Institute for Advanced Computer Studies Department of Computer Science College Park TR 2007 09 TR 4854 A Residual Inverse Power Method G. W. Stewart February 2007 ABSTRACT The inverse

More information

Linear Algebra, part 3 QR and SVD

Linear Algebra, part 3 QR and SVD Linear Algebra, part 3 QR and SVD Anna-Karin Tornberg Mathematical Models, Analysis and Simulation Fall semester, 2012 Going back to least squares (Section 1.4 from Strang, now also see section 5.2). We

More information

1 Error analysis for linear systems

1 Error analysis for linear systems Notes for 2016-09-16 1 Error analysis for linear systems We now discuss the sensitivity of linear systems to perturbations. This is relevant for two reasons: 1. Our standard recipe for getting an error

More information

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 9

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 9 CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 9 GENE H GOLUB 1 Error Analysis of Gaussian Elimination In this section, we will consider the case of Gaussian elimination and perform a detailed

More information

Iterative methods for Linear System

Iterative methods for Linear System Iterative methods for Linear System JASS 2009 Student: Rishi Patil Advisor: Prof. Thomas Huckle Outline Basics: Matrices and their properties Eigenvalues, Condition Number Iterative Methods Direct and

More information

Numerical Methods. Elena loli Piccolomini. Civil Engeneering. piccolom. Metodi Numerici M p. 1/??

Numerical Methods. Elena loli Piccolomini. Civil Engeneering.  piccolom. Metodi Numerici M p. 1/?? Metodi Numerici M p. 1/?? Numerical Methods Elena loli Piccolomini Civil Engeneering http://www.dm.unibo.it/ piccolom elena.loli@unibo.it Metodi Numerici M p. 2/?? Least Squares Data Fitting Measurement

More information

Krylov subspace projection methods

Krylov subspace projection methods I.1.(a) Krylov subspace projection methods Orthogonal projection technique : framework Let A be an n n complex matrix and K be an m-dimensional subspace of C n. An orthogonal projection technique seeks

More information

Class notes: Approximation

Class notes: Approximation Class notes: Approximation Introduction Vector spaces, linear independence, subspace The goal of Numerical Analysis is to compute approximations We want to approximate eg numbers in R or C vectors in R

More information

Algorithms that use the Arnoldi Basis

Algorithms that use the Arnoldi Basis AMSC 600 /CMSC 760 Advanced Linear Numerical Analysis Fall 2007 Arnoldi Methods Dianne P. O Leary c 2006, 2007 Algorithms that use the Arnoldi Basis Reference: Chapter 6 of Saad The Arnoldi Basis How to

More information

Backward perturbation analysis for scaled total least-squares problems

Backward perturbation analysis for scaled total least-squares problems NUMERICAL LINEAR ALGEBRA WITH APPLICATIONS Numer. Linear Algebra Appl. 009; 16:67 648 Published online 5 March 009 in Wiley InterScience (www.interscience.wiley.com)..640 Backward perturbation analysis

More information

ON THE GLOBAL KRYLOV SUBSPACE METHODS FOR SOLVING GENERAL COUPLED MATRIX EQUATIONS

ON THE GLOBAL KRYLOV SUBSPACE METHODS FOR SOLVING GENERAL COUPLED MATRIX EQUATIONS ON THE GLOBAL KRYLOV SUBSPACE METHODS FOR SOLVING GENERAL COUPLED MATRIX EQUATIONS Fatemeh Panjeh Ali Beik and Davod Khojasteh Salkuyeh, Department of Mathematics, Vali-e-Asr University of Rafsanjan, Rafsanjan,

More information

be a Householder matrix. Then prove the followings H = I 2 uut Hu = (I 2 uu u T u )u = u 2 uut u

be a Householder matrix. Then prove the followings H = I 2 uut Hu = (I 2 uu u T u )u = u 2 uut u MATH 434/534 Theoretical Assignment 7 Solution Chapter 7 (71) Let H = I 2uuT Hu = u (ii) Hv = v if = 0 be a Householder matrix Then prove the followings H = I 2 uut Hu = (I 2 uu )u = u 2 uut u = u 2u =

More information

Krylov Subspaces. Lab 1. The Arnoldi Iteration

Krylov Subspaces. Lab 1. The Arnoldi Iteration Lab 1 Krylov Subspaces Lab Objective: Discuss simple Krylov Subspace Methods for finding eigenvalues and show some interesting applications. One of the biggest difficulties in computational linear algebra

More information

LU Factorization. LU factorization is the most common way of solving linear systems! Ax = b LUx = b

LU Factorization. LU factorization is the most common way of solving linear systems! Ax = b LUx = b AM 205: lecture 7 Last time: LU factorization Today s lecture: Cholesky factorization, timing, QR factorization Reminder: assignment 1 due at 5 PM on Friday September 22 LU Factorization LU factorization

More information

Stability of the Gram-Schmidt process

Stability of the Gram-Schmidt process Stability of the Gram-Schmidt process Orthogonal projection We learned in multivariable calculus (or physics or elementary linear algebra) that if q is a unit vector and v is any vector then the orthogonal

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences)

AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) AMS526: Numerical Analysis I (Numerical Linear Algebra for Computational and Data Sciences) Lecture 1: Course Overview; Matrix Multiplication Xiangmin Jiao Stony Brook University Xiangmin Jiao Numerical

More information

Some Notes on Least Squares, QR-factorization, SVD and Fitting

Some Notes on Least Squares, QR-factorization, SVD and Fitting Department of Engineering Sciences and Mathematics January 3, 013 Ove Edlund C000M - Numerical Analysis Some Notes on Least Squares, QR-factorization, SVD and Fitting Contents 1 Introduction 1 The Least

More information

M.A. Botchev. September 5, 2014

M.A. Botchev. September 5, 2014 Rome-Moscow school of Matrix Methods and Applied Linear Algebra 2014 A short introduction to Krylov subspaces for linear systems, matrix functions and inexact Newton methods. Plan and exercises. M.A. Botchev

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra The two principal problems in linear algebra are: Linear system Given an n n matrix A and an n-vector b, determine x IR n such that A x = b Eigenvalue problem Given an n n matrix

More information

Scientific Computing

Scientific Computing Scientific Computing Direct solution methods Martin van Gijzen Delft University of Technology October 3, 2018 1 Program October 3 Matrix norms LU decomposition Basic algorithm Cost Stability Pivoting Pivoting

More information

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 9

STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 9 STAT 309: MATHEMATICAL COMPUTATIONS I FALL 2018 LECTURE 9 1. qr and complete orthogonal factorization poor man s svd can solve many problems on the svd list using either of these factorizations but they

More information

Sparse least squares and Q-less QR

Sparse least squares and Q-less QR Notes for 2016-02-29 Sparse least squares and Q-less QR Suppose we want to solve a full-rank least squares problem in which A is large and sparse. In principle, we could solve the problem via the normal

More information

Applied Linear Algebra in Geoscience Using MATLAB

Applied Linear Algebra in Geoscience Using MATLAB Applied Linear Algebra in Geoscience Using MATLAB Contents Getting Started Creating Arrays Mathematical Operations with Arrays Using Script Files and Managing Data Two-Dimensional Plots Programming in

More information