CG VERSUS MINRES: AN EMPIRICAL COMPARISON


DAVID CHIN-LUNG FONG AND MICHAEL SAUNDERS

Abstract. For iterative solution of symmetric systems Ax = b, the conjugate gradient method (CG) is commonly used when A is positive definite, while the minimum residual method (MINRES) is typically reserved for indefinite systems. We investigate the sequence of approximate solutions x_k generated by each method and suggest that even if A is positive definite, MINRES may be preferable to CG if iterations are to be terminated early. In particular, we show for MINRES that the solution norms ||x_k|| are monotonically increasing when A is positive definite (as was already known for CG), and the solution errors ||x* − x_k|| are monotonically decreasing. We also show that the backward errors for the MINRES iterates x_k are monotonically decreasing.

Key words. conjugate gradient method, minimum residual method, iterative method, sparse matrix, linear equations, CG, CR, MINRES, Krylov subspace method, trust-region method

Technical report, Systems Optimization Laboratory (to appear in SQU Journal for Science, 17:1 (2012), 44-62). ICME, Stanford University, CA, USA (david3.459@gmail.com); partially supported by a Stanford Graduate Fellowship. Systems Optimization Laboratory, Department of Management Science and Engineering, Stanford University, CA, USA (saunders@stanford.edu); partially supported by an Office of Naval Research grant, an NSF grant, and the U.S. Army Research Laboratory through the Army High Performance Computing Research Center.

1. Introduction. The conjugate gradient method (CG) [11] and the minimum residual method (MINRES) [18] are both Krylov subspace methods for the iterative solution of symmetric linear equations Ax = b. CG is commonly used when the matrix A is positive definite, while MINRES is generally reserved for indefinite systems [27, p. 85]. We reexamine this wisdom from the point of view of early termination on positive-definite systems.

We assume that the system Ax = b is real with A symmetric positive definite (spd) and of dimension n × n. The Lanczos process [13] with starting vector b may be used to generate the n × k matrix V_k ≡ (v_1 v_2 ... v_k) and the (k+1) × k Hessenberg tridiagonal matrix T_k such that AV_k = V_{k+1} T_k for k = 1, 2, ..., l−1 and AV_l = V_l T_l for some l ≤ n, where the columns of V_k form a theoretically orthonormal basis for the kth Krylov subspace K_k(A, b) ≡ span{b, Ab, A²b, ..., A^{k−1} b}, and T_l is l × l and tridiagonal. Approximate solutions within the kth Krylov subspace may be formed as x_k = V_k y_k for some k-vector y_k. As shown in [18], three iterative methods (CG, MINRES, and SYMMLQ) may be derived by choosing y_k appropriately at each iteration. CG is well defined if A is spd, while MINRES and SYMMLQ are stable for any symmetric nonsingular A.

As noted by Choi [2], SYMMLQ can form an approximation x_{k+1} = V_{k+1} y_{k+1} in the (k+1)th Krylov subspace when CG and MINRES are forming their approximations x_k = V_k y_k in the kth subspace. It would be of future interest to compare all three methods on spd systems, but for the remainder of this paper we focus on CG and MINRES.

With different methods using the same information V_{k+1} and T_k to compute solution estimates x_k = V_k y_k within the same Krylov subspace (for each k), it is commonly thought that the number of iterations required will be similar for each method, and hence CG should be preferable on spd systems because it requires somewhat fewer floating-point operations per iteration. This view is justified if an accurate solution is required (stopping tolerance τ close to machine precision ε).
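To make the Lanczos relation concrete, the following Matlab sketch (ours, not the authors' software; the function name lanczos_sketch is an illustrative assumption) builds V_k and T_k from A and b so that AV_k = V_{k+1} T_k can be checked numerically.

    % Minimal sketch of the Lanczos process for symmetric A and starting vector b,
    % returning V (n-by-(k+1)) and T ((k+1)-by-k) with A*V(:,1:k) = V*T in exact
    % arithmetic. Illustrative only; no reorthogonalization is performed.
    function [V, T] = lanczos_sketch(A, b, k)
      n = length(b);
      V = zeros(n, k+1);  T = zeros(k+1, k);
      beta = norm(b);  V(:,1) = b / beta;
      v_prev = zeros(n,1);
      for j = 1:k
        w = A*V(:,j) - (j > 1)*beta*v_prev;   % here beta is beta_{j-1}
        alpha = V(:,j)'*w;
        w = w - alpha*V(:,j);
        T(j,j) = alpha;
        beta = norm(w);
        T(j+1,j) = beta;
        if j > 1, T(j-1,j) = T(j,j-1); end    % T is tridiagonal
        v_prev = V(:,j);
        if beta == 0, break; end              % invariant Krylov subspace (k = l)
        V(:,j+1) = w / beta;
      end
    end

For modest k on a well-conditioned spd matrix, norm(A*V(:,1:k) - V*T) is near machine precision, illustrating the relation used throughout; without reorthogonalization the columns of V gradually lose orthogonality, which is the source of the numerical error mentioned in Section 4.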

We show that with looser stopping tolerances, MINRES is sure to terminate sooner than CG when the stopping rule is based on the backward error for x_k, and by numerical examples we illustrate that the difference in iteration numbers can be substantial.

1.1. Notation. We study the application of CG and MINRES to real symmetric positive-definite (spd) systems Ax = b. The unique solution is denoted by x*. The initial approximate solution is x_0 = 0, and r_k ≡ b − A x_k is the residual vector for an approximation x_k within the kth Krylov subspace. For a vector v and matrix A, ||v|| and ||A|| denote the 2-norm and the Frobenius norm respectively, and A ≻ 0 indicates that A is spd.

2. Minimization properties of Krylov subspace methods. With exact arithmetic, the Lanczos process terminates with k = l for some l ≤ n. To ensure that the approximations x_k = V_k y_k improve by some measure as k increases toward l, the Krylov solvers minimize some convex function within the expanding Krylov subspaces [9].

2.1. CG. When A is spd, the quadratic form φ(x) ≡ ½ x^T A x − b^T x is bounded below, and its unique minimizer solves Ax = b. A characterization of the CG iterations is that they minimize the quadratic form within each Krylov subspace [9], [17, §2.4], [28, §8]:

    x_k^C = V_k y_k^C,  where  y_k^C = arg min_y φ(V_k y).

With b = A x* and 2φ(x_k) = x_k^T A x_k − 2 x_k^T A x*, this is equivalent to minimizing the function

    ||x* − x_k||_A ≡ sqrt( (x* − x_k)^T A (x* − x_k) ),

known as the energy norm of the error, within each Krylov subspace. For some applications, this is a desirable property [1, 4, 21, 27, 28].

2.2. MINRES. For nonsingular (and possibly indefinite) systems, the residual norm was used in [18] to characterize the MINRES iterations:

    x_k^M = V_k y_k^M,  where  y_k^M = arg min_y ||b − A V_k y||.                         (2.1)

Thus, MINRES minimizes ||r_k|| within the kth Krylov subspace. This was also an aim of Stiefel's Conjugate Residual method (CR) [23] for spd systems (and of Luenberger's extensions of CR to indefinite systems [15, 16]). Thus, CR and MINRES must generate the same iterates on spd systems. We use this connection to prove that ||x_k|| increases monotonically when MINRES is applied to an spd system.
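As a small illustration of these two minimization properties (our example, not one of the paper's test problems), Matlab's built-in solvers can be run side by side on an spd system: pcg implements CG, which minimizes the energy norm of the error within each Krylov subspace, while minres minimizes the residual norm.

    % Illustrative comparison of CG and MINRES on a random spd system.
    % Both solvers see the same A and b; sprandsym with kind = 1 returns an
    % spd matrix with reciprocal condition number approximately rc.
    rng(0);
    n  = 500;
    A  = sprandsym(n, 0.01, 1e-4, 1);     % spd, condition number about 1e4
    b  = randn(n, 1);
    tol = 1e-10;  maxit = n;
    [xc, ~, ~, itc, resvecC] = pcg(A, b, tol, maxit);      % CG iterates
    [xm, ~, ~, itm, resvecM] = minres(A, b, tol, maxit);   % MINRES iterates
    % MINRES residual norms decrease monotonically; CG residual norms need not.
    fprintf('CG: %d iterations,  MINRES: %d iterations\n', itc, itm);
    semilogy(0:length(resvecC)-1, resvecC, '-', 0:length(resvecM)-1, resvecM, '--');
    legend('CG  ||r_k||', 'MINRES  ||r_k||');  xlabel('iteration k');

The resvec outputs give the residual-norm histories that the two solvers produce from the same Krylov subspaces, which is exactly the comparison developed in the rest of the paper.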

2.3. CG and CR. The two methods for solving spd systems Ax = b are summarized in Table 2.1. The first two columns are pseudocodes for CG and CR with the iteration number k omitted for clarity; they match our Matlab implementations. Note that q = Ap in both methods, but it is not computed as such in CR. Termination occurs when r = 0 (ρ = β = 0). To prove our main result we need to introduce iteration indices; see column 3 of Table 2.1. Termination occurs when r_k = 0 for some index k = l ≤ n (ρ_l = β_l = 0, r_l = s_l = p_l = q_l = 0). Note: this l is the same as the l at which the Lanczos process theoretically terminates for the given A and b.

Table 2.1. Pseudocode for algorithms CG and CR.

CG:
    Initialize x = 0, r = b, ρ = r^T r, p = r
    Repeat
        q = Ap
        α = ρ / p^T q
        x ← x + αp
        r ← r − αq
        ρ̄ = ρ,  ρ = r^T r
        β = ρ / ρ̄
        p ← r + βp

CR:
    Initialize x = 0, r = b, s = Ar, ρ = r^T s, p = r, q = s
    Repeat
        (q = Ap)
        α = ρ / ||q||²
        x ← x + αp
        r ← r − αq
        s = Ar
        ρ̄ = ρ,  ρ = r^T s
        β = ρ / ρ̄
        p ← r + βp
        q ← s + βq

CR with iteration indices:
    Initialize x_0 = 0, r_0 = b, s_0 = A r_0, ρ_0 = r_0^T s_0, p_0 = r_0, q_0 = s_0
    For k = 1, 2, ...
        (q_{k−1} = A p_{k−1})
        α_k = ρ_{k−1} / ||q_{k−1}||²
        x_k = x_{k−1} + α_k p_{k−1}
        r_k = r_{k−1} − α_k q_{k−1}
        s_k = A r_k
        ρ_k = r_k^T s_k
        β_k = ρ_k / ρ_{k−1}
        p_k = r_k + β_k p_{k−1}
        q_k = s_k + β_k q_{k−1}

Theorem 2.1. The following properties hold for Algorithm CR:
(a) q_i^T q_j = 0  (i ≠ j)
(b) r_i^T q_j = 0  (i ≥ j + 1)
Proof. Given in [16, Theorem 1].

Theorem 2.2. The following properties hold for Algorithm CR:
(a) α_i ≥ 0   (b) β_i ≥ 0   (c) p_i^T q_j ≥ 0   (d) p_i^T p_j ≥ 0   (e) x_i^T p_j ≥ 0   (f) r_i^T p_j ≥ 0

Proof. (a) Here we use the fact that A is spd. The inequalities are strict until i = l (and r_l = 0):

    ρ_i = r_i^T s_i = r_i^T A r_i ≥ 0   (A ≻ 0),                                            (2.2)
    α_i = ρ_{i−1} / ||q_{i−1}||² ≥ 0.

(b) And again: β_i = ρ_i / ρ_{i−1} ≥ 0 (by (2.2)).

(c) Case I: i = j.
    p_i^T q_i = p_i^T A p_i ≥ 0   (A ≻ 0).
Case II: i − j = k > 0.
    p_i^T q_j = p_i^T q_{i−k} = r_i^T q_{i−k} + β_i p_{i−1}^T q_{i−k} = β_i p_{i−1}^T q_{i−k} ≥ 0   (by Thm 2.1(b)),
where β_i ≥ 0 by (b) and p_{i−1}^T q_{i−k} ≥ 0 by induction, as (i−1) − (i−k) = k − 1 < k.

Case III: j − i = k > 0.
    p_i^T q_j = p_i^T q_{i+k} = p_i^T A p_{i+k} = p_i^T A (r_{i+k} + β_{i+k} p_{i+k−1})
              = q_i^T r_{i+k} + β_{i+k} p_i^T q_{i+k−1} = β_{i+k} p_i^T q_{i+k−1} ≥ 0   (by Thm 2.1(b)),
where β_{i+k} ≥ 0 by (b) and p_i^T q_{i+k−1} ≥ 0 by induction, as (i+k−1) − i = k − 1 < k.

(d) At termination, define P ≡ span{p_0, p_1, ..., p_{l−1}} and Q ≡ span{q_0, ..., q_{l−1}}. By construction, P = span{b, Ab, ..., A^{l−1} b} and Q = span{Ab, ..., A^l b} (since q_i = A p_i). Again by construction, x_l ∈ P, and since r_l = 0 we have b = A x_l, so b ∈ Q. We see that P ⊆ Q. By Theorem 2.1(a), {q_i / ||q_i||}_{i=0}^{l−1} forms an orthonormal basis for Q. If we project p_i ∈ P ⊆ Q onto this basis, we have

    p_i = Σ_{k=0}^{l−1} (p_i^T q_k / q_k^T q_k) q_k,

where all coordinates are nonnegative from (c). Similarly for any other p_j, j < l. Therefore p_i^T p_j ≥ 0 for any i, j < l.

(e) By construction,
    x_i = x_{i−1} + α_i p_{i−1} = ··· = Σ_{k=1}^{i} α_k p_{k−1}   (x_0 = 0).
Therefore x_i^T p_j ≥ 0 by (d) and (a).

(f) Note that any r_i can be expressed as a sum of the q_j:
    r_i = r_{i+1} + α_{i+1} q_i = ··· = r_l + α_l q_{l−1} + ··· + α_{i+1} q_i = α_l q_{l−1} + ··· + α_{i+1} q_i.
Thus we have
    r_i^T p_j = (α_l q_{l−1} + ··· + α_{i+1} q_i)^T p_j ≥ 0,
where the inequality follows from (a) and (c).

We are now able to prove our main theorem about the monotonic increase of ||x_k|| for CR and MINRES. A similar result was proved for CG by Steihaug [21].

Theorem 2.3. For CR (and hence MINRES) on an spd system Ax = b, ||x_k|| increases monotonically.
Proof.
    ||x_i||² − ||x_{i−1}||² = 2 α_i x_{i−1}^T p_{i−1} + α_i² p_{i−1}^T p_{i−1} ≥ 0,
where the inequality follows from Theorem 2.2 (a), (d) and (e). Therefore ||x_i|| ≥ ||x_{i−1}||.

||x* − x_k|| is known to be monotonic for CG [11]. The corresponding result for CR is a direct consequence of [11, Thm 7:5]. However, the second half of that theorem, ||x* − x_k^C|| > ||x* − x_k^M||, rarely holds in machine arithmetic. We give here an alternative proof that does not depend on CG.

Theorem 2.4. For CR (and hence MINRES) on an spd system Ax = b, the error ||x* − x_k|| decreases monotonically.
Proof. From the update rule for x_k, we can express the final solution x_l = x* as

    x_l = x_{l−1} + α_l p_{l−1} = ··· = x_k + α_{k+1} p_k + ··· + α_l p_{l−1}                      (2.3)
        = x_{k−1} + α_k p_{k−1} + α_{k+1} p_k + ··· + α_l p_{l−1}.                                 (2.4)

Using the last two equalities above, we can write

    ||x_l − x_{k−1}||² − ||x_l − x_k||² = (x_l − x_{k−1})^T (x_l − x_{k−1}) − (x_l − x_k)^T (x_l − x_k)
        = 2 α_k p_{k−1}^T (α_{k+1} p_k + ··· + α_l p_{l−1}) + α_k² p_{k−1}^T p_{k−1} ≥ 0,

where the inequality follows from Theorem 2.2 (a), (d).

The energy-norm error ||x* − x_k||_A is monotonic for CG by design. The corresponding result for CR is given in [11, Thm 7:4]. We give an alternative proof here.

Theorem 2.5. For CR (and hence MINRES) on an spd system Ax = b, the error in energy norm ||x* − x_k||_A is strictly decreasing.
Proof. From (2.3) and (2.4), we can write

    ||x_l − x_{k−1}||_A² − ||x_l − x_k||_A² = (x_l − x_{k−1})^T A (x_l − x_{k−1}) − (x_l − x_k)^T A (x_l − x_k)
        = 2 α_k p_{k−1}^T A (α_{k+1} p_k + ··· + α_l p_{l−1}) + α_k² p_{k−1}^T A p_{k−1}
        = 2 α_k q_{k−1}^T (α_{k+1} p_k + ··· + α_l p_{l−1}) + α_k² q_{k−1}^T p_{k−1} > 0,

where the last inequality follows from Theorem 2.2 (a), (c).
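The monotonicity properties of Theorems 2.3-2.5 are easy to observe numerically. The following Matlab sketch implements Algorithm CR as indexed in the third column of Table 2.1 and records ||x_k||, ||x* − x_k||, and ||x* − x_k||_A; it is our illustrative code (the function name cr_history is ours), not the authors' implementation.

    % Algorithm CR (Conjugate Residual) following Table 2.1, with histories of
    % the quantities in Theorems 2.3-2.5. Assumes A is spd and that xstar = A\b
    % is affordable purely for monitoring the error norms.
    function [x, normx, err2, errA] = cr_history(A, b, tol, maxit)
      xstar = A \ b;                       % reference solution, monitoring only
      x = zeros(size(b));  r = b;  s = A*r;
      rho = r'*s;  p = r;  q = s;
      normx = [];  err2 = [];  errA = [];
      for k = 1:maxit
        alpha = rho / (q'*q);
        x = x + alpha*p;
        r = r - alpha*q;
        e = xstar - x;
        normx(end+1) = norm(x);            % increases monotonically (Theorem 2.3)
        err2(end+1)  = norm(e);            % decreases monotonically (Theorem 2.4)
        errA(end+1)  = sqrt(e'*(A*e));     % strictly decreasing (Theorem 2.5)
        if norm(r) <= tol*norm(b), break; end
        s = A*r;
        rhobar = rho;  rho = r'*s;
        beta = rho / rhobar;
        p = r + beta*p;
        q = s + beta*q;
      end
    end

Up to rounding, diff(normx) ≥ 0, diff(err2) ≤ 0, and diff(errA) < 0 on spd problems, mirroring the three theorems.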

3. Backward error analysis. For many physical problems requiring numerical solution, we are given inexact or uncertain input data (in this case A and/or b). It is not justifiable to seek a solution beyond the accuracy of the data [6]. Instead, it is more reasonable to stop an iterative solver once we know that the current approximate solution solves a nearby problem. The measure of "nearby" should match the error in the input data. The design of such stopping rules is an important application of backward error analysis.

For a consistent linear system Ax = b, we think of x_k as coming from the kth iteration of one of the iterative solvers. Following Titley-Péloquin [25] we say that x_k is an acceptable solution if and only if there exist perturbations E and f satisfying

    (A + E) x_k = b + f,   ||E|| / ||A|| ≤ α,   ||f|| / ||b|| ≤ β                              (3.1)

for some tolerances α, β that reflect the (preferably known) accuracy of the data. We are naturally interested in minimizing the size of E and f. If we define the optimization problem

    min_{ξ, E, f}  ξ   s.t.   (A + E) x_k = b + f,   ||E|| / ||A|| ≤ αξ,   ||f|| / ||b|| ≤ βξ

to have optimal solution ξ_k, E_k, f_k (all functions of x_k, α, and β), we see that x_k is an acceptable solution if and only if ξ_k ≤ 1. We call ξ_k the normwise relative backward error (NRBE) for x_k.

With r_k = b − A x_k, the optimal solution ξ_k, E_k, f_k is shown in [25] to be

    φ_k = β ||b|| / (α ||A|| ||x_k|| + β ||b||),      E_k = ((1 − φ_k) / ||x_k||²) r_k x_k^T,      (3.2)
    ξ_k = ||r_k|| / (α ||A|| ||x_k|| + β ||b||),      f_k = −φ_k r_k.                               (3.3)

(See [12, p. 12] for the case β = 0 and [12, §7.1 and p. 336] for the case α = β.)

3.1. Stopping rule. For general tolerances α and β, the condition ξ_k ≤ 1 for x_k to be an acceptable solution becomes

    ||r_k|| ≤ α ||A|| ||x_k|| + β ||b||,                                                            (3.4)

the stopping rule used in LSQR for consistent systems [19, p. 54, rule S1].

3.2. Monotonic backward errors. Of interest is the size of the perturbations to A and b for which x_k is an exact solution of Ax = b. From (3.2)-(3.3), the perturbations have the following norms:

    ||E_k|| = (1 − φ_k) ||r_k|| / ||x_k|| = α ||A|| ||r_k|| / (α ||A|| ||x_k|| + β ||b||),           (3.5)
    ||f_k|| = φ_k ||r_k|| = β ||b|| ||r_k|| / (α ||A|| ||x_k|| + β ||b||).                           (3.6)

Since ||x_k|| is monotonically increasing for CG and MINRES, we see from (3.2) that φ_k is monotonically decreasing for both solvers. Since ||r_k|| is monotonically decreasing for MINRES (but not for CG), we have the following result.

Theorem 3.1. Suppose α > 0 and β > 0 in (3.1). For CR and MINRES (but not CG), the relative backward errors ||E_k|| / ||A|| and ||f_k|| / ||b|| decrease monotonically.
Proof. This follows from (3.5)-(3.6) with ||x_k|| increasing for both solvers and ||r_k|| decreasing for CR and MINRES but not for CG.

4. Numerical results. Here we compare the convergence of CG and MINRES on various spd systems Ax = b and some associated indefinite systems (A − δI)x = b. The test examples are drawn from the University of Florida Sparse Matrix Collection (Davis [5]). We experimented with all the cases for which A is real spd and b is supplied. In Matlab we computed the condition number for each test matrix by finding the largest and smallest eigenvalues using eigs(A,1,'LM') and eigs(A,1,'SM') respectively. For this test set, the condition numbers vary over many orders of magnitude.

Since A is spd, we apply diagonal preconditioning by redefining A and b as follows: d = diag(A), D = diag(1./sqrt(d)), A ← DAD, b ← Db, b ← b/||b||. Thus in the figures below we have diag(A) = I and ||b|| = 1. With this preconditioning, the condition numbers are considerably reduced. The distribution of condition numbers of the test-set matrices before and after preconditioning is shown in Figure 4.1.

The stopping rule used for CG and MINRES was (3.4) with α = 0 and a small fixed β; that is, ||r_k|| / ||b|| ≤ β (but with a maximum of 5n iterations for spd systems and n iterations for indefinite systems).
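For concreteness, the diagonal preconditioning and stopping rule just described can be written in a few lines of Matlab. This is our paraphrase of the setup (variable names and the tolerance value beta are illustrative assumptions), not the authors' experiment scripts.

    % Symmetric diagonal scaling followed by the backward-error stopping test
    % (3.4) with alpha = 0, i.e. stop when ||r_k|| <= beta*||b||.  A is assumed
    % spd with a nonzero diagonal; beta is an illustrative tolerance.
    d = full(diag(A));
    D = spdiags(1./sqrt(d), 0, size(A,1), size(A,1));
    A = D*A*D;                  % now diag(A) = I
    b = D*b;  b = b/norm(b);    % now ||b|| = 1
    beta  = 1e-8;               % illustrative value of beta (alpha = 0 in (3.4))
    maxit = 5*size(A,1);
    [x, flag, relres, iter, resvec] = minres(A, b, beta, maxit);
    % With alpha = 0 the NRBE (3.3) reduces to ||r_k||/(beta*||b||) <= 1, which
    % is exactly the relative-residual test that pcg and minres apply internally.

Passing beta as the solver tolerance therefore implements (3.4) with alpha = 0 directly; the more general test with alpha > 0 would instead compare norm(b - A*x) against alpha*normest(A)*norm(x) + beta*norm(b) after each iteration.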

Fig. 4.1. Distribution of condition numbers for the matrices used in the CG vs MINRES comparison, before and after diagonal preconditioning (cond(A) plotted against matrix ID).

4.1. Positive-definite systems. In plotting backward errors, we assume for simplicity that α > 0 and β = 0 in (3.2)-(3.3), even though this does not match the choice of α and β in the stopping rule (3.4). This gives φ_k = 0 and ||E_k|| = ||r_k|| / ||x_k|| in (3.5). Thus, as in Theorem 3.1, we expect ||E_k|| to decrease monotonically for CR and MINRES but not for CG.

In Figures 4.2 and 4.3, we plot ||r_k|| / ||x_k||, ||x* − x_k||_A, and ||x* − x_k|| for CG and MINRES on four different problems. For CG, the plots confirm that ||x* − x_k||_A and ||x* − x_k|| are monotonic. For MINRES, the plots confirm the prediction of Theorems 3.1, 2.5, and 2.4 that ||r_k|| / ||x_k||, ||x* − x_k||_A, and ||x* − x_k|| are monotonic.

Figure 4.2 (left) shows problem Schenk_AFE/af_shell8 with A of size 504855 × 504855. From the plot of backward errors ||r_k|| / ||x_k||, we see that both CG and MINRES converge quickly in the early iterations. Then the backward error of MINRES plateaus, and the backward error of CG stays about an order of magnitude behind. A similar phenomenon of fast convergence at early iterations followed by slow convergence is observed in the energy-norm error and 2-norm error plots.

Figure 4.2 (right) shows problem Cannizzo/sts4098 with A of size 4098 × 4098 and cond(A) = 6.7E+3. MINRES converges slightly faster in terms of backward error, while CG converges slightly faster in terms of both error norms.

Fig. 4.2. Comparison of backward and forward errors for CG and MINRES solving two spd systems Ax = b. Left: Problem Schenk_AFE/af_shell8 with n = 504855. Note that MINRES stops significantly sooner than CG under the stopping rule (3.4). Right: Cannizzo/sts4098 with n = 4098 and cond(A) = 6.7E+3. MINRES stops slightly sooner than CG. Top: The values of log10(||r_k|| / ||x_k||) are plotted against iteration number k; these values define log10(||E_k||) when the stopping tolerances in (3.4) are α > 0 and β = 0. Middle: The values of log10 ||x* − x_k||_A, the quantity that CG minimizes at each iteration. Bottom: The values of log10 ||x* − x_k||.

Fig. 4.3. Comparison of backward and forward errors for CG and MINRES solving two more spd systems Ax = b. Left: Problem Simon/raefsky4 with n = 19779. Right: BenElechi/BenElechi1 with n = 245874. Top: The values of log10(||r_k|| / ||x_k||) are plotted against iteration number k; these values define log10(||E_k||) when the stopping tolerances in (3.4) are α > 0 and β = 0. Middle: The values of log10 ||x* − x_k||_A, the quantity that CG minimizes at each iteration. Bottom: The values of log10 ||x* − x_k||.

Figure 4.3 (left) shows problem Simon/raefsky4 with A of size 19779 × 19779. Because of the high condition number, both algorithms hit the 5n iteration limit. We see that the backward error for MINRES converges faster than for CG, as expected. For the energy-norm error, CG is able to decrease over 5 orders of magnitude, while MINRES plateaus after a much smaller decrease. For both the energy-norm error and the 2-norm error, MINRES reaches a lower point than CG for some iterations. This must be due to numerical error in CG and MINRES (a result of loss of orthogonality in V_k).

Figure 4.3 (right) shows problem BenElechi/BenElechi1 with A of size 245874 × 245874. The backward error of MINRES stays ahead of that of CG by orders of magnitude for most iterations. Eventually the backward error of both algorithms goes down rapidly and CG catches up with MINRES. Both algorithms exhibit a plateau in the energy-norm error over many early iterations. The error norms then start decreasing and decrease even faster toward the end of the run.

Figure 4.4 shows ||r_k|| and ||x_k|| for CG and MINRES on two typical spd examples. We see that ||x_k|| is monotonically increasing for both solvers, and the ||x_k|| values rise fairly rapidly to their limiting value ||x*||, with a moderate delay for MINRES.

Figure 4.5 shows ||r_k|| and ||x_k|| for CG and MINRES on two spd examples in which the residual decrease and the solution-norm increase are somewhat slower than typical. The rise of ||x_k|| for MINRES is rather more delayed. In the second case, if the stopping tolerance β were somewhat looser, the final ||x_k|| would be less than half the exact value ||x*||. It will be of future interest to evaluate this effect within the context of trust-region methods for optimization.

4.1.1. Why does ||r_k|| for CG lag behind? It is commonly thought that even though MINRES is known to minimize ||r_k|| at each iteration, the cumulative minimum of ||r_k|| for CG should approximately match that of MINRES. That is, min_{i ≤ k} ||r_i^C|| ≈ ||r_k^M||. However, in Figures 4.2 and 4.3 we see that ||r_k|| for MINRES is often smaller than for CG by one or two orders of magnitude. This phenomenon can be explained by the following relation between r_k^C and r_k^M ([10, Lemma 5.4.1] and [26]):

    ||r_k^C|| = ||r_k^M|| / sqrt( 1 − ( ||r_k^M|| / ||r_{k−1}^M|| )² ).                        (4.1)

From (4.1), one can infer that if ||r_k^M|| decreases substantially between iterations k−1 and k, then ||r_k^C|| will be roughly the same as ||r_k^M||. The converse also holds, in that ||r_k^C|| will be much larger than ||r_k^M|| if MINRES is almost stalling at iteration k (i.e., ||r_k^M|| did not decrease much relative to the previous iteration).

The above analysis was pointed out by Titley-Peloquin [26] in comparing LSQR and LSMR [8]. We repeat the analysis here for CG vs MINRES and extend it to demonstrate why there is a lag in general for large problems.
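Relation (4.1) is easy to check numerically from the residual-norm histories returned by Matlab's pcg and minres. The sketch below is our illustration (the test matrix is a random spd example, not one of the paper's problems); agreement is only approximate in finite precision.

    % Numerical check of relation (4.1) between CG and MINRES residual norms.
    % The prediction blows up when MINRES stalls (ratio close to 1), which is
    % precisely when CG lags far behind.
    rng(1);
    n = 300;
    A = sprandsym(n, 0.05, 1e-3, 1);             % spd, condition number about 1e3
    b = randn(n, 1);
    tol = 1e-14;  maxit = 40;
    [~, ~, ~, ~, rC] = pcg(A, b, tol, maxit);    % rC(j+1) = ||r_j|| for CG
    [~, ~, ~, ~, rM] = minres(A, b, tol, maxit); % rM(j+1) = ||r_j|| for MINRES
    m = min(length(rC), length(rM));
    ratio     = rM(2:m) ./ rM(1:m-1);            % ||r_k^M|| / ||r_{k-1}^M||
    predicted = rM(2:m) ./ sqrt(1 - ratio.^2);   % right-hand side of (4.1)
    disp([rC(2:m) predicted]);                   % the two columns roughly agree

When MINRES makes good progress (ratio well below 1) the two columns nearly coincide; when MINRES nearly stalls the predicted CG residual is much larger, which is the lag visible in Figures 4.2 and 4.3.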

With α = 0 in stopping rule (3.4), CG and MINRES stop when ||r_k|| ≤ β ||b||. If this occurs at iteration l, we have

    Π_{k=1}^{l} ( ||r_k|| / ||r_{k−1}|| ) = ||r_l|| / ||b|| ≤ β.

Thus on average, ||r_k^M|| / ||r_{k−1}^M|| will be closer to 1 if l is large. This means that the larger l is (in absolute terms), the more CG will lag behind MINRES (a bigger gap between ||r_k^C|| and ||r_k^M||).

Fig. 4.4. Comparison of residual and solution norms for CG and MINRES solving two spd systems Ax = b. These are typical examples. Left: Problem Simon/olafu with n = 16146 and cond(A) = 4.3E+8. Right: Problem Cannizzo/sts4098 with n = 4098 and cond(A) = 6.7E+3. Top: The values of log10 ||r_k|| are plotted against iteration number k. Bottom: The values of ||x_k|| are plotted against k. The solution norms grow somewhat faster for CG than for MINRES. Both reach the limiting value ||x*|| significantly before x_k is close to x*.

4.2. Indefinite systems. A key part of Steihaug's trust-region method for large-scale unconstrained optimization [21] (see also [4]) is his proof that when CG is applied to a symmetric (possibly indefinite) system Ax = b, the solution norms ||x_1||, ..., ||x_k|| are strictly increasing as long as p_j^T A p_j > 0 for all iterations j ≤ k. (We are using the notation in Table 2.1.)

Fig. 4.5. Comparison of residual and solution norms for CG and MINRES solving two more spd systems Ax = b. Sometimes the solution norms take longer to reach the limiting value ||x*||. Left: Problem Schmid/thermal1 with n = 82654. Right: Problem BenElechi/BenElechi1 with n = 245874. Top: The values of log10 ||r_k|| are plotted against iteration number k. Bottom: The values of ||x_k|| are plotted against k. Again the solution norms grow faster for CG.

From our proof of Theorem 2.2, we see that the same property holds for CR and MINRES as long as both p_j^T A p_j > 0 and r_j^T A r_j > 0 for all iterations j ≤ k. In case future research finds that MINRES is a useful solver in the trust-region context, it is of interest now to offer some empirical results about the behavior of ||x_k|| when MINRES is applied to indefinite systems.

Fig. 4.6. For MINRES on the indefinite problem (4.2), ||x_k|| (left, against iteration number) and the backward error ||r_k|| / ||x_k|| (right, against iteration number) are both slightly non-monotonic.

First, on the nonsingular indefinite system (4.2), MINRES gives non-monotonic solution norms, as shown in the left plot of Figure 4.6. The decrease in ||x_k|| implies that the backward errors ||r_k|| / ||x_k|| may not be monotonic, as illustrated in the right plot.

More generally, we can gain an impression of the behavior of ||x_k|| by recalling from Choi et al. [3] the connection between MINRES and MINRES-QLP. Both methods compute the iterates x_k^M = V_k y_k^M in (2.1) from the subproblems

    y_k^M = arg min_{y ∈ R^k} || T_k y − β_1 e_1 ||,   and possibly   T_l y_l^M = β_1 e_1.

When A is nonsingular or Ax = b is consistent (which we now assume), y_k^M is uniquely defined for each k ≤ l and the methods compute the same iterates x_k^M (but by different numerical methods). In fact they both compute the expanding QR factorizations

    Q_k [ T_k   β_1 e_1 ] = [ R_k   t_k ]
                            [  0    φ_k ]

(with R_k upper tridiagonal), and MINRES-QLP also computes the orthogonal factorizations R_k P_k = L_k (with L_k lower tridiagonal), from which the kth solution estimate is defined by W_k = V_k P_k, L_k u_k = t_k, and x_k^M = W_k u_k. As shown in [3, §5.3], the construction of these quantities is such that the first k−3 columns of W_k are the same as in W_{k−1}, and the first k−3 elements of u_k are the same as in u_{k−1}. Since W_k has orthonormal columns, ||x_k^M|| = ||u_k||, where the leading elements of u_k are unaltered by later iterations. As shown in [3, §6.5], this means that certain quantities can be cheaply updated to give norm estimates of the form

    χ² ← χ² + μ̂_{k−2}²,      ||x_k^M||² = χ² + μ̃_{k−1}² + μ̄_k²,

where it is clear that χ² increases monotonically. Although the last two terms are of unpredictable size, ||x_k^M||² tends to be dominated by the monotonic term χ² and we can expect that ||x_k^M|| will be approximately monotonic as k increases from 1 to l.

Fig. 4.7. Residual norms and solution norms when MINRES is applied to two indefinite systems (A − δI)x = b, where A is one of the spd matrices used in Figure 4.4 and δ = 0.5 is large enough to make the system indefinite. Left: Problem Simon/olafu with n = 16146. Right: Problem Cannizzo/sts4098 with n = 4098. Top: The values of log10 ||r_k|| are plotted against iteration number k for the first n iterations. Bottom left: The values of ||x_k|| are plotted against k. During the n = 16146 iterations, ||x_k|| increased 83% of the time and the backward errors ||r_k|| / ||x_k|| (not shown) decreased 96% of the time. Bottom right: During the n = 4098 iterations, ||x_k|| increased over 90% of the time and the backward errors ||r_k|| / ||x_k|| (not shown) decreased 98% of the time.

Experimentally we find that for most iterations on an indefinite problem, ||x_k|| does increase. To obtain indefinite examples that were sensibly scaled, we used the four spd (A, b) cases in Figures 4.4-4.5, applied diagonal scaling as before, and solved (A − δI)x = b with δ = 0.5, where A and b are now the scaled quantities (so that diag(A) = I). The number of iterations increased significantly but was limited to n.

Figure 4.7 shows log10 ||r_k|| and ||x_k|| for the first two cases (where A is one of the spd matrices in Figure 4.4). The values of ||x_k|| are essentially monotonic. The backward errors ||r_k|| / ||x_k|| (not shown) were even closer to being monotonic (at least for the first n iterations).
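A small-scale version of this experiment can be scripted directly. The sketch below is our illustration (a generic spd test matrix and the shift delta = 0.5 are assumptions standing in for the paper's UF problems); it runs MINRES on the shifted, diagonally scaled system and reports how often ||x_k|| increases.

    % Fraction of MINRES iterations at which ||x_k|| increases on an indefinite
    % shifted system (A - delta*I)x = b.  The k-th iterate is obtained by
    % re-running minres with maxit = k, which uses only the documented interface.
    rng(2);
    n = 200;
    A = sprandsym(n, 0.05, 1e-3, 1);
    d = full(diag(A));  D = spdiags(1./sqrt(d), 0, n, n);
    A = D*A*D;  b = D*randn(n,1);  b = b/norm(b);   % diag(A) = I, ||b|| = 1
    delta = 0.5;                                    % assumed large enough for indefiniteness
    B = A - delta*speye(n);
    K = n;                                          % look at the first n iterations
    tol = 1e-14;                                    % tiny tol so the full k steps are taken
    normx = zeros(K,1);
    for k = 1:K
        [xk, ~] = minres(B, b, tol, k);             % k-th MINRES iterate
        normx(k) = norm(xk);
    end
    fprintf('||x_k|| increased on %.0f%% of iterations\n', 100*mean(diff(normx) > 0));

The same loop can also accumulate norm(b - B*xk)/norm(xk) to estimate how often the backward error decreases, which is the statistic reported in the captions of Figures 4.7 and 4.8.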

Fig. 4.8. Residual norms and solution norms when MINRES is applied to two indefinite systems (A − δI)x = b, where A is one of the spd matrices used in Figure 4.5 and δ = 0.5 is large enough to make the system indefinite. Left: Problem Schmid/thermal1 with n = 82654. Right: Problem BenElechi/BenElechi1 with n = 245874. Top: The values of log10 ||r_k|| are plotted against iteration number k for the first n iterations. Bottom left: The values of ||x_k|| are plotted against k. There is a mild but clear decrease in ||x_k|| over an extended interval of iterations. During the n = 82654 iterations, ||x_k|| increased 83% of the time and the backward errors ||r_k|| / ||x_k|| (not shown) decreased over 90% of the time. Bottom right: The solution norms and backward errors are essentially monotonic. During the n = 245874 iterations, ||x_k|| increased 88% of the time and the backward errors ||r_k|| / ||x_k|| (not shown) decreased 95% of the time.

Figure 4.8 shows ||x_k|| and log10 ||r_k|| for the second two cases (where A is one of the spd matrices in Figure 4.5). The left example reveals a definite period of decrease in ||x_k||. Nevertheless, during the n = 82654 iterations, ||x_k|| increased 83% of the time and the backward errors ||r_k|| / ||x_k|| decreased over 90% of the time. The right example is more like those in Figure 4.7. During the n = 245874 iterations, ||x_k|| increased 83% of the time, the backward errors ||r_k|| / ||x_k|| decreased over 90% of the time, and any nonmonotonicity was very slight.

Table 5.1. Comparison of CG and MINRES properties on an spd system Ax = b.

    quantity            CG                                  MINRES
    ||x_k||             increasing  [21, Thm 2.1]           increasing  (Thm 2.3)
    ||x* - x_k||        decreasing  [11, Thm 4:3]           decreasing  (Thm 2.4), [11, Thm 7:5]
    ||x* - x_k||_A      decreasing  [11, Thm 6:3]           decreasing  (Thm 2.5), [11, Thm 7:4]
    ||r_k||             not monotonic                       decreasing  [18], [11, Thm 7]
    ||r_k|| / ||x_k||   not monotonic                       decreasing  (Thm 3.1)

(Here "increasing" means monotonically increasing and "decreasing" means monotonically decreasing.)

Table 5.2. Comparison of LSQR and LSMR properties on min ||Ax − b||.

    quantity            LSQR                                LSMR
    ||x_k||             increasing  [7, Thm 3.3.1]          increasing  [7, Thm 3.3.6]
    ||x* - x_k||        decreasing  [7, Thm 3.3.2]          decreasing  [7, Thm 3.3.7]
    ||r* - r_k||        decreasing  [7, Thm 3.3.3]          decreasing  [7, Thm 3.3.8]
    ||A^T r_k||         not monotonic                       decreasing  [8, §3]
    ||r_k||             decreasing  [19, §5.1]              decreasing  [7, §3.3]

In addition, (||A^T r_k|| / ||r_k||)_LSQR ≥ (||A^T r_k|| / ||r_k||)_LSMR, and ||x_k|| converges to the minimum-norm x for singular systems.

5. Conclusions. For full-rank least-squares problems min ||Ax − b||, the solvers LSQR [19, 20] and LSMR [8, 14] are equivalent to CG and MINRES on the (spd) normal equation A^T A x = A^T b. Comparisons in [8] indicated that LSMR can often stop much sooner than LSQR when the stopping rule is based on Stewart's backward error norm ||A^T r_k|| / ||r_k|| for least-squares problems [22]. Our theoretical and experimental results here provide analogous evidence that MINRES can often stop much sooner than CG on spd systems when the stopping rule is based on the backward error ||r_k|| / ||x_k|| for Ax = b (or the more general backward errors in Theorem 3.1). In some cases, MINRES can converge faster than CG by orders of magnitude (Figure 4.3). On the other hand, CG converges somewhat faster than MINRES in terms of both ||x* − x_k||_A and ||x* − x_k|| (same figure).

For spd systems, Table 5.1 summarizes properties that were already known from Hestenes and Stiefel [11] and Steihaug [21], along with the two additional properties of MINRES that we proved here (Theorems 2.3 and 3.1). These theorems and experiments on CG and MINRES are part of the first author's PhD thesis [7], which also discusses LSQR and LSMR and derives some new results for both solvers. Table 5.2 summarizes the known results for LSQR and LSMR (in [19] and [8] respectively) and the newly derived properties for both solvers (in [7]).

Acknowledgements. We kindly thank Professor Mehiddin Al-Baali and other colleagues for organizing the Second International Conference on Numerical Analysis and Optimization at Sultan Qaboos University (January 3-6, Muscat, Sultanate of Oman). Their wish to publish some of the conference papers in a special issue of SQU Journal for Science gave added motivation for this research. We also thank Dr Sou-Cheng Choi for many helpful discussions of the iterative solvers, and Michael Friedlander for a final welcome tip.

REFERENCES

[1] M. Arioli, A stopping criterion for the conjugate gradient algorithm in a finite element method framework, Numer. Math., 97 (2004), pp. 1-24.
[2] S.-C. Choi, Iterative Methods for Singular Linear Equations and Least-Squares Problems, PhD thesis, Stanford University, Stanford, CA, December 2006.
[3] S.-C. Choi, C. C. Paige, and M. A. Saunders, MINRES-QLP: a Krylov subspace method for indefinite or singular symmetric systems, SIAM J. Sci. Comput., 33 (2011), pp. 1810-1836.
[4] A. R. Conn, N. I. M. Gould, and Ph. L. Toint, Trust-Region Methods, MPS-SIAM Series on Optimization, SIAM, Philadelphia, 2000.
[5] T. A. Davis, University of Florida Sparse Matrix Collection, research/sparse/matrices.
[6] J. J. Dongarra, J. R. Bunch, C. B. Moler, and G. W. Stewart, LINPACK Users' Guide, SIAM, Philadelphia, 1979.
[7] D. C.-L. Fong, Minimum-Residual Methods for Sparse Least-Squares Using Golub-Kahan Bidiagonalization, PhD thesis, Stanford University, Stanford, CA, December 2011.
[8] D. C.-L. Fong and M. A. Saunders, LSMR: An iterative algorithm for sparse least-squares problems, SIAM J. Sci. Comput., 33 (2011), pp. 2950-2971.
[9] R. W. Freund, G. H. Golub, and N. M. Nachtigal, Iterative solution of linear systems, Acta Numerica, 1 (1992), pp. 57-100.
[10] A. Greenbaum, Iterative Methods for Solving Linear Systems, SIAM, Philadelphia, 1997.
[11] M. R. Hestenes and E. Stiefel, Methods of conjugate gradients for solving linear systems, J. Res. Nat. Bur. Standards, 49 (1952), pp. 409-436.
[12] N. J. Higham, Accuracy and Stability of Numerical Algorithms, SIAM, Philadelphia, second ed., 2002.
[13] C. Lanczos, An iteration method for the solution of the eigenvalue problem of linear differential and integral operators, J. Res. Nat. Bur. Standards, 45 (1950), pp. 255-282.
[14] LSMR software for linear systems and least squares, software.html.
[15] D. G. Luenberger, Hyperbolic pairs in the method of conjugate gradients, SIAM J. Appl. Math., 17 (1969).
[16] D. G. Luenberger, The conjugate residual method for constrained minimization problems, SIAM J. Numer. Anal., 7 (1970).
[17] G. A. Meurant, The Lanczos and Conjugate Gradient Algorithms: From Theory to Finite Precision Computations, vol. 19 of Software, Environments, and Tools, SIAM, Philadelphia, 2006.
[18] C. C. Paige and M. A. Saunders, Solution of sparse indefinite systems of linear equations, SIAM J. Numer. Anal., 12 (1975), pp. 617-629.
[19] C. C. Paige and M. A. Saunders, LSQR: An algorithm for sparse linear equations and sparse least squares, ACM Trans. Math. Softw., 8 (1982), pp. 43-71.
[20] C. C. Paige and M. A. Saunders, Algorithm 583; LSQR: Sparse linear equations and least-squares problems, ACM Trans. Math. Softw., 8 (1982), pp. 195-209.
[21] T. Steihaug, The conjugate gradient method and trust regions in large scale optimization, SIAM J. Numer. Anal., 20 (1983), pp. 626-637.
[22] G. W. Stewart, Research, development and LINPACK, in Mathematical Software III, J. R. Rice, ed., Academic Press, New York, 1977, pp. 1-14.
[23] E. Stiefel, Relaxationsmethoden bester Strategie zur Lösung linearer Gleichungssysteme, Comm. Math. Helv., 29 (1955), pp. 157-179.
[24] Y. Sun, The Filter Algorithm for Solving Large-Scale Eigenproblems from Accelerator Simulations, PhD thesis, Stanford University, Stanford, CA, March 2003.
[25] D. Titley-Peloquin, Backward Perturbation Analysis of Least Squares Problems, PhD thesis, School of Computer Science, McGill University, 2010.
[26] D. Titley-Peloquin, Convergence of backward error bounds in LSQR and LSMR. Private communication.
[27] H. A. van der Vorst, Iterative Krylov Methods for Large Linear Systems, Cambridge University Press, Cambridge, first ed., 2003.
[28] D. S. Watkins, Fundamentals of Matrix Computations, Pure and Applied Mathematics, Wiley, Hoboken, NJ, third ed., 2010.


OUTLINE 1. Introduction 1.1 Notation 1.2 Special matrices 2. Gaussian Elimination 2.1 Vector and matrix norms 2.2 Finite precision arithmetic 2.3 Fact Computational Linear Algebra Course: (MATH: 6800, CSCI: 6800) Semester: Fall 1998 Instructors: { Joseph E. Flaherty, aherje@cs.rpi.edu { Franklin T. Luk, luk@cs.rpi.edu { Wesley Turner, turnerw@cs.rpi.edu

More information

Chapter 7 Iterative Techniques in Matrix Algebra

Chapter 7 Iterative Techniques in Matrix Algebra Chapter 7 Iterative Techniques in Matrix Algebra Per-Olof Persson persson@berkeley.edu Department of Mathematics University of California, Berkeley Math 128B Numerical Analysis Vector Norms Definition

More information

LAPACK-Style Codes for Pivoted Cholesky and QR Updating. Hammarling, Sven and Higham, Nicholas J. and Lucas, Craig. MIMS EPrint: 2006.

LAPACK-Style Codes for Pivoted Cholesky and QR Updating. Hammarling, Sven and Higham, Nicholas J. and Lucas, Craig. MIMS EPrint: 2006. LAPACK-Style Codes for Pivoted Cholesky and QR Updating Hammarling, Sven and Higham, Nicholas J. and Lucas, Craig 2007 MIMS EPrint: 2006.385 Manchester Institute for Mathematical Sciences School of Mathematics

More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication.

CME342 Parallel Methods in Numerical Analysis. Matrix Computation: Iterative Methods II. Sparse Matrix-vector Multiplication. CME342 Parallel Methods in Numerical Analysis Matrix Computation: Iterative Methods II Outline: CG & its parallelization. Sparse Matrix-vector Multiplication. 1 Basic iterative methods: Ax = b r = b Ax

More information

UNIFYING LEAST SQUARES, TOTAL LEAST SQUARES AND DATA LEAST SQUARES

UNIFYING LEAST SQUARES, TOTAL LEAST SQUARES AND DATA LEAST SQUARES UNIFYING LEAST SQUARES, TOTAL LEAST SQUARES AND DATA LEAST SQUARES Christopher C. Paige School of Computer Science, McGill University, Montreal, Quebec, Canada, H3A 2A7 paige@cs.mcgill.ca Zdeněk Strakoš

More information

Introduction. Chapter One

Introduction. Chapter One Chapter One Introduction The aim of this book is to describe and explain the beautiful mathematical relationships between matrices, moments, orthogonal polynomials, quadrature rules and the Lanczos and

More information

DELFT UNIVERSITY OF TECHNOLOGY

DELFT UNIVERSITY OF TECHNOLOGY DELFT UNIVERSITY OF TECHNOLOGY REPORT 10-12 Large-Scale Eigenvalue Problems in Trust-Region Calculations Marielba Rojas, Bjørn H. Fotland, and Trond Steihaug ISSN 1389-6520 Reports of the Department of

More information

Course Notes: Week 1

Course Notes: Week 1 Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues

More information

Topics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems

Topics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems Topics The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems What about non-spd systems? Methods requiring small history Methods requiring large history Summary of solvers 1 / 52 Conjugate

More information

Conjugate gradient method. Descent method. Conjugate search direction. Conjugate Gradient Algorithm (294)

Conjugate gradient method. Descent method. Conjugate search direction. Conjugate Gradient Algorithm (294) Conjugate gradient method Descent method Hestenes, Stiefel 1952 For A N N SPD In exact arithmetic, solves in N steps In real arithmetic No guaranteed stopping Often converges in many fewer than N steps

More information

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 0

CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 0 CME 302: NUMERICAL LINEAR ALGEBRA FALL 2005/06 LECTURE 0 GENE H GOLUB 1 What is Numerical Analysis? In the 1973 edition of the Webster s New Collegiate Dictionary, numerical analysis is defined to be the

More information

The amount of work to construct each new guess from the previous one should be a small multiple of the number of nonzeros in A.

The amount of work to construct each new guess from the previous one should be a small multiple of the number of nonzeros in A. AMSC/CMSC 661 Scientific Computing II Spring 2005 Solution of Sparse Linear Systems Part 2: Iterative methods Dianne P. O Leary c 2005 Solving Sparse Linear Systems: Iterative methods The plan: Iterative

More information

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012

Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 Instructions Preliminary/Qualifying Exam in Numerical Analysis (Math 502a) Spring 2012 The exam consists of four problems, each having multiple parts. You should attempt to solve all four problems. 1.

More information

On the Preconditioning of the Block Tridiagonal Linear System of Equations

On the Preconditioning of the Block Tridiagonal Linear System of Equations On the Preconditioning of the Block Tridiagonal Linear System of Equations Davod Khojasteh Salkuyeh Department of Mathematics, University of Mohaghegh Ardabili, PO Box 179, Ardabil, Iran E-mail: khojaste@umaacir

More information

Numerical Linear Algebra Primer. Ryan Tibshirani Convex Optimization /36-725

Numerical Linear Algebra Primer. Ryan Tibshirani Convex Optimization /36-725 Numerical Linear Algebra Primer Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: proximal gradient descent Consider the problem min g(x) + h(x) with g, h convex, g differentiable, and h simple

More information

A SUFFICIENTLY EXACT INEXACT NEWTON STEP BASED ON REUSING MATRIX INFORMATION

A SUFFICIENTLY EXACT INEXACT NEWTON STEP BASED ON REUSING MATRIX INFORMATION A SUFFICIENTLY EXACT INEXACT NEWTON STEP BASED ON REUSING MATRIX INFORMATION Anders FORSGREN Technical Report TRITA-MAT-2009-OS7 Department of Mathematics Royal Institute of Technology November 2009 Abstract

More information

Analysis of Block LDL T Factorizations for Symmetric Indefinite Matrices

Analysis of Block LDL T Factorizations for Symmetric Indefinite Matrices Analysis of Block LDL T Factorizations for Symmetric Indefinite Matrices Haw-ren Fang August 24, 2007 Abstract We consider the block LDL T factorizations for symmetric indefinite matrices in the form LBL

More information

Non-stationary extremal eigenvalue approximations in iterative solutions of linear systems and estimators for relative error

Non-stationary extremal eigenvalue approximations in iterative solutions of linear systems and estimators for relative error on-stationary extremal eigenvalue approximations in iterative solutions of linear systems and estimators for relative error Divya Anand Subba and Murugesan Venatapathi* Supercomputer Education and Research

More information

Preconditioning Techniques Analysis for CG Method

Preconditioning Techniques Analysis for CG Method Preconditioning Techniques Analysis for CG Method Huaguang Song Department of Computer Science University of California, Davis hso@ucdavis.edu Abstract Matrix computation issue for solve linear system

More information

Arnoldi Methods in SLEPc

Arnoldi Methods in SLEPc Scalable Library for Eigenvalue Problem Computations SLEPc Technical Report STR-4 Available at http://slepc.upv.es Arnoldi Methods in SLEPc V. Hernández J. E. Román A. Tomás V. Vidal Last update: October,

More information

LAPACK-Style Codes for Pivoted Cholesky and QR Updating

LAPACK-Style Codes for Pivoted Cholesky and QR Updating LAPACK-Style Codes for Pivoted Cholesky and QR Updating Sven Hammarling 1, Nicholas J. Higham 2, and Craig Lucas 3 1 NAG Ltd.,Wilkinson House, Jordan Hill Road, Oxford, OX2 8DR, England, sven@nag.co.uk,

More information

Solving Sparse Linear Systems: Iterative methods

Solving Sparse Linear Systems: Iterative methods Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccs Lecture Notes for Unit VII Sparse Matrix Computations Part 2: Iterative Methods Dianne P. O Leary c 2008,2010

More information

Solving Sparse Linear Systems: Iterative methods

Solving Sparse Linear Systems: Iterative methods Scientific Computing with Case Studies SIAM Press, 2009 http://www.cs.umd.edu/users/oleary/sccswebpage Lecture Notes for Unit VII Sparse Matrix Computations Part 2: Iterative Methods Dianne P. O Leary

More information

Computational Linear Algebra

Computational Linear Algebra Computational Linear Algebra PD Dr. rer. nat. habil. Ralf-Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2018/19 Part 4: Iterative Methods PD

More information