A general Krylov method for solving symmetric systems of linear equations

Anders FORSGREN   Tove ODLAND

Technical Report TRITA-MAT-2014-OS-01, Department of Mathematics, KTH Royal Institute of Technology, March 2014

Abstract

Krylov subspace methods are used for solving systems of linear equations Hx + c = 0. We present a general Krylov subspace method that can be applied to a symmetric system of linear equations, i.e., to a system in which H = H^T. In each iteration, we have a choice of scaling of the orthogonal vectors that successively make the Krylov subspaces available. We define an extended representation of each such Krylov vector, so that the Krylov vector is represented by a triple. We show that our Krylov subspace method is able to determine whether the system of equations is compatible. If so, a solution is computed. If not, a certificate of incompatibility is computed. The method of conjugate gradients is obtained as a special case of our general method. Our framework gives a way to understand the method of conjugate gradients, in particular when H is not (positive) definite. Finally, we derive a minimum-residual method based on our framework and show how the iterates may be updated explicitly based on the triples.

1. Introduction

An important problem in numerical linear algebra is to solve a system of equations in which the matrix is symmetric. Such a problem may be posed as

Hx + c = 0,   (1.1)

for x in R^n, with c in R^n and H = H^T in R^{n x n}. Our primary motivation comes from KKT systems arising in optimization, where H is symmetric but in general indefinite, see, e.g., [5], but there are many other applications. Throughout, H is assumed symmetric; any other assumptions on H at particular instances are stated explicitly. It is also assumed throughout that c ≠ 0.

A strategy for a class of methods for solving (1.1) is to generate linearly independent vectors q_k, k = 0, 1, ..., until q_k becomes linearly dependent for some k = r ≤ n, so that q_r = 0. Such methods have finite termination properties.

(Optimization and Systems Theory, Department of Mathematics, KTH Royal Institute of Technology, SE-100 44 Stockholm, Sweden (andersf@kth.se, odland@kth.se). Research partially supported by the Swedish Research Council (VR).)

In this paper we consider a general Krylov subspace method in which the generated vectors form an orthogonal basis for the Krylov subspaces generated by H and c,

K_0(c, H) = {0},   K_k(c, H) = span{c, Hc, H^2 c, ..., H^{k-1} c},   k = 1, 2, ....   (1.2)

In our method, the process by which these vectors are generated bears much resemblance to the Lanczos process, with the difference that the available scaling parameter is left explicitly undecided. The method of conjugate gradients is derived as a special case of this general Krylov method. In addition, the minimum-residual method and explicit recursions for the minimum-residual iterates can be derived within our framework. There have been very many contributions to the theory of the Lanczos process, the method of conjugate gradients and minimum-residual methods; see, e.g., [6] for an extensive survey of the years 1948-1976, and [7]. We assume exact arithmetic, and the theory developed in this paper is based on that assumption. At the end of the paper we briefly discuss computational aspects of our results in finite precision.

The outline of the paper is as follows. In Section 2 we define and give a recursion for the Krylov vectors q_k in K_{k+1}(c, H), k = 0, ..., r. In Section 3 we define triples (q_k, y_k, δ_k) associated with q_k, k = 0, ..., r, such that q_k = H y_k + δ_k c, with q_k in K_{k+1}(c, H), y_k in K_k(c, H) and δ_k in R. These triples are defined up to a scaling, and recursions for them are given in Proposition 3.1. We then state several results concerning the triples, and using these results we devise an algorithm in Section 3.3 that either gives the solution, if (1.1) is compatible (in this case we show that δ_r ≠ 0), or gives a certificate of incompatibility (in this case δ_r = 0). The main attribute is that if we do not require δ_k to attain a specific value or sign, then we do not have to require anything more than symmetry of H. Further, in Section 4, the method of conjugate gradients is placed in our framework, which gives a better way of understanding its behavior, in particular when H is not a (positive) definite matrix. Finally, in Section 5, a minimum-residual method applicable also to incompatible systems is placed in this framework, and explicit recursions for the minimum-residual iterates are given. An algorithm for this method is given in Section 5.1.

1.1. Notation

The letters i, j and k denote integer indices; other lowercase letters such as q, y and c denote column vectors, possibly with super- and/or subscripts. For a symmetric matrix H, H ≻ 0 denotes H positive definite and H ⪰ 0 denotes H positive semidefinite. N(H) denotes the null space of H and R(H) denotes the range space of H. We denote by Z an orthonormal matrix whose columns form a basis for N(H); if H is nonsingular, then Z is to be interpreted as an empty matrix. When referring to a norm, the Euclidean norm is used throughout.
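To make the definition (1.2) concrete, the following is a minimal numpy sketch that builds the matrix [c, Hc, ..., H^{k-1} c] whose columns span K_k(c, H) and reports its rank. The function name krylov_basis and the particular H and c are our own illustrative choices, not taken from the paper.

```python
import numpy as np

def krylov_basis(H, c, k):
    """Columns c, Hc, ..., H^(k-1) c, spanning the Krylov subspace K_k(c, H)."""
    cols = [c.astype(float)]
    for _ in range(k - 1):
        cols.append(H @ cols[-1])
    return np.column_stack(cols)

# For this symmetric H the subspace dimension stalls at r = 3: ranks 1, 2, 3, 3.
H = np.diag([2.0, -1.0, 3.0])
c = np.array([1.0, 1.0, 1.0])
for k in range(1, 5):
    print(k, np.linalg.matrix_rank(krylov_basis(H, c, k)))
```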

2. Background

Regarding (1.1), the raw data available is the matrix H and the vector c, together with combinations of the two, for example represented by the Krylov subspaces generated by H and c, as defined in (1.2). For an introduction and background on Krylov subspaces, see, e.g., [18].

Without loss of generality, the scaling of the first Krylov vector q_0 may be chosen so that q_0 = c. One sequence of linearly independent vectors may then be generated by letting q_k in K_{k+1}(c, H) ∩ K_k(c, H)^⊥, k = 1, ..., r, such that q_k ≠ 0 for k = 0, 1, ..., r-1 and q_r = 0, where r is the minimum index k for which K_{k+1}(c, H) ∩ K_k(c, H)^⊥ = {0}. The vectors {q_0, q_1, ..., q_{r-1}} form an orthogonal, hence linearly independent, basis of K_r(c, H). We refer to these vectors as the Krylov vectors. With q_0 = c, each vector q_k, k = 1, ..., r-1, is uniquely determined up to a scaling.

A vector q_k in K_{k+1}(c, H) may be expressed as

q_k = Σ_{j=0}^{k} δ_k^{(j)} H^j c,   (2.1)

for some parameters δ_k^{(j)}, j = 0, ..., k. In order to ensure that q_k = 0 if and only if H^k c is linearly dependent on c, Hc, ..., H^{k-1} c, it must hold that δ_k^{(k)} ≠ 0. Also note that for k < r, the vectors c, Hc, ..., H^{k-1} c are linearly independent. Hence, since q_k is uniquely determined up to a nonzero scaling, so are δ_k^{(j)}, j = 0, ..., k. For k = r we have q_r = 0, and since δ_r^{(r)} ≠ 0, the same argument shows that δ_r^{(j)}, j = 0, ..., r, are uniquely determined up to a common nonzero scaling. This is made precise in Lemma A.1.

The following proposition states a recursion for such a sequence of vectors, where the scaling factors, denoted by {α_k}_{k=0}^{r-1}, are left unspecified. The recursion in Proposition 2.1 is a slight generalization of the Lanczos process for generating mutually orthogonal vectors, see [10, 11], in which the scaling of each vector q_k is chosen such that ||q_k|| = 1, k = 0, ..., r-1. For completeness, the proposition and its proof are included.

Proposition 2.1. Let r denote the smallest positive integer k for which K_{k+1}(c, H) ∩ K_k(c, H)^⊥ = {0}. Given q_0 = c in K_1(c, H), there exist vectors q_k in K_{k+1}(c, H) ∩ K_k(c, H)^⊥, k = 1, ..., r, for which q_k ≠ 0, k = 1, ..., r-1, and q_r = 0. Each such q_k, k = 1, ..., r-1, is uniquely determined up to a scaling, and a sequence {q_k}_{k=1}^{r} may be generated as

q_1 = -α_0 ( H q_0 - (q_0^T H q_0 / q_0^T q_0) q_0 ),   (2.2a)
q_{k+1} = -α_k ( H q_k - (q_k^T H q_k / q_k^T q_k) q_k - (q_{k-1}^T H q_k / q_{k-1}^T q_{k-1}) q_{k-1} ),   k = 1, ..., r-1,   (2.2b)

where α_k, k = 0, ..., r-1, are free and nonzero parameters. In addition, it holds that

q_{k+1}^T q_{k+1} = -α_k q_{k+1}^T H q_k,   k = 0, ..., r-1.   (2.3)

Proof. Given q_0 = c, let k be an integer such that 1 ≤ k ≤ r-1. Assume that q_i, i = 0, ..., k, are mutually orthogonal with q_i in K_{i+1}(c, H) ∩ K_i(c, H)^⊥. Let q_{k+1} in K_{k+2}(c, H) be expressed as

q_{k+1} = -α_k H q_k + Σ_{i=0}^{k} η_k^{(i)} q_i,   k = 0, ..., r-1.   (2.4)

In order for q_{k+1} to be orthogonal to q_i, i = 0, ..., k, the parameters η_k^{(i)}, i = 0, ..., k, are uniquely determined as follows. For k = 0, to have q_0^T q_1 = 0 it must hold that η_0^{(0)} = α_0 q_0^T H q_0 / q_0^T q_0, hence obtaining q_1 in K_2(c, H) ∩ K_1(c, H)^⊥ as in (2.2a), where α_0 is free and nonzero. For k such that 1 ≤ k ≤ r-1, in order to have q_i^T q_{k+1} = 0, i = 0, ..., k, it must hold that

η_k^{(k)} = α_k q_k^T H q_k / q_k^T q_k,   η_k^{(k-1)} = α_k q_{k-1}^T H q_k / q_{k-1}^T q_{k-1},   and   η_k^{(i)} = 0, i = 0, ..., k-2.

The last relation follows by the symmetry of H. Hence, we obtain q_{k+1} in K_{k+2}(c, H) ∩ K_{k+1}(c, H)^⊥ as in the three-term recurrence (2.2b), where α_k, k = 1, ..., r-1, are free and nonzero. Since q_1 is orthogonal to q_0, and since q_{k+1} is orthogonal to q_k and q_{k-1}, k = 1, ..., r-1, pre-multiplication of (2.2) by q_{k+1}^T yields q_{k+1}^T q_{k+1} = -α_k q_{k+1}^T H q_k, k = 0, ..., r-1. Finally note that if q_{k+1} is given by (2.2), then the only term that increases the power of H is -α_k H q_k. Since α_k ≠ 0, repeated use of this argument gives δ_{k+1}^{(k+1)} ≠ 0 if q_{k+1} is expressed as in (2.1); in fact, δ_{k+1}^{(k+1)} = (-1)^{k+1} Π_{i=0}^{k} α_i. Hence, by Lemma A.1, q_{k+1} = 0 implies K_{k+2}(c, H) ∩ K_{k+1}(c, H)^⊥ = {0}, so that k+1 = r, as required.

The choice of sign in front of the first term of (2.4) is arbitrary. Our choice of minus sign is made in order to obtain coherence with existing theory, as will be shown later on. Many methods for solving (1.1) are based on the Lanczos process, in which the scaling factors are chosen so that the generated vectors have norm one, and matrix-factorization techniques are applied to the symmetric tridiagonal matrix obtained by putting the three-term recurrence on matrix form. For an introduction to how Krylov subspace methods are formalized in this way, see, e.g., [2, 15]. For our purposes we leave these available scaling factors unspecified.
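As an illustration of Proposition 2.1, the following numpy sketch runs the recurrence (2.2) with the free scaling fixed to α_k = 1; the function name and test data are ours, and the code is only a sketch (it makes no attempt to handle finite-precision loss of orthogonality). The printed Gram matrix of the generated vectors is numerically diagonal, confirming their mutual orthogonality.

```python
import numpy as np

def krylov_vectors(H, c, alpha=lambda k: 1.0, tol=1e-12):
    """Generate q_0, q_1, ... by the three-term recurrence (2.2); stop when q_{k+1} ~ 0,
    i.e. when k + 1 = r.  The free nonzero scalings alpha_k are supplied by the caller."""
    Q = [c.astype(float)]                              # q_0 = c
    q_prev = None
    for k in range(c.size):
        q = Q[-1]
        Hq = H @ q
        t = Hq - (q @ Hq) / (q @ q) * q                # remove the component along q_k
        if q_prev is not None:                         # ... and along q_{k-1}
            t -= (q_prev @ Hq) / (q_prev @ q_prev) * q_prev
        q_next = -alpha(k) * t                         # the sign convention of (2.4)
        if np.linalg.norm(q_next) <= tol:              # q_{k+1} = 0: k + 1 = r
            break
        q_prev = q
        Q.append(q_next)
    return Q

H = np.diag([2.0, -1.0, 3.0, -1.0])                    # symmetric, indefinite
Q = krylov_vectors(H, np.ones(4))
print(np.round([[qi @ qj for qj in Q] for qi in Q], 10))   # diagonal Gram matrix
```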

3. A general Krylov method

3.1. An extended representation of the Krylov vectors

To find a solution of (1.1), if one exists, it is not sufficient to generate the sequence {q_k}_{k=1}^{r}. Note that, as in (2.1), q_k in K_{k+1}(c, H), k = 0, ..., r, may be expressed as

q_k = Σ_{j=0}^{k} δ_k^{(j)} H^j c = H ( Σ_{j=1}^{k} δ_k^{(j)} H^{j-1} c ) + δ_k^{(0)} c,   k = 1, ..., r.   (3.1)

It is not convenient to represent q_k by (2.1). Therefore, we define y_0 = 0, δ_0 = 1, and

y_k = Σ_{j=1}^{k} δ_k^{(j)} H^{j-1} c in K_k(c, H)   and   δ_k = δ_k^{(0)},   k = 1, ..., r,

so that (3.1) gives q_k = H y_k + δ_k c. These quantities may be represented by a triple (q_k, y_k, δ_k), k = 0, ..., r. Note that {δ_k^{(j)}}_{j=0}^{k} are given in association with the raw data H and c; the choice we make is to use only δ_k^{(0)} explicitly and collect all other terms in y_k. These triples form the basis for the results of this paper.

It is straightforward to note that if δ_r ≠ 0, then 0 = q_r = H y_r + δ_r c, so that H x_r + c = 0 for x_r = (1/δ_r) y_r and x_r solves (1.1). It will be shown that (1.1) has a solution if and only if δ_r ≠ 0. We define (q_0, y_0, δ_0) = (c, 0, 1). As mentioned earlier, for a given k such that 1 ≤ k ≤ r, the parameters δ_k^{(j)}, j = 0, ..., k, are uniquely defined up to a scaling; hence, so is the triple (q_k, y_k, δ_k). This is made precise in the recursion given in Proposition 3.1. The symbols g_k and x_k are reserved for the case when the scaling yields δ_k = 1, in which case we denote (q_k, y_k, δ_k) by (g_k, x_k, 1). Such a scaling is not always well defined. However, as we shall show, there is no need to base the scaling on the value of δ_k.

It is possible to continue the recursion, and let y_k = H y_k^{(1)} + δ_k^{(1)} c, with

y_k^{(1)} = Σ_{j=2}^{k} δ_k^{(j)} H^{j-2} c in K_{k-1}(c, H),   k = 2, ..., r,   (3.2)

in addition to y_1^{(1)} = 0. This will be used in the analysis, but not in the algorithm presented.

3.2. Results on the triples

Based on Proposition 2.1, the following proposition states a recursion for (q_k, y_k, δ_k), k = 1, ..., r, given (q_0, y_0, δ_0) = (c, 0, 1).

Proposition 3.1. Let r denote the smallest positive integer k for which K_{k+1}(c, H) ∩ K_k(c, H)^⊥ = {0}. Given (q_0, y_0, δ_0) = (c, 0, 1), there exist triples (q_k, y_k, δ_k), k = 1, ..., r, such that q_k in K_{k+1}(c, H) ∩ K_k(c, H)^⊥, y_k in K_k(c, H), q_k = H y_k + δ_k c, k = 1, ..., r, for which q_k ≠ 0, k = 1, ..., r-1, and q_r = 0. Each such (q_k, y_k, δ_k), k = 1, ..., r, is uniquely determined up to a scaling, and a sequence {(q_k, y_k, δ_k)}_{k=1}^{r} may be generated as

y_1 = -α_0 ( q_0 - (q_0^T H q_0 / q_0^T q_0) y_0 ),   δ_1 = α_0 (q_0^T H q_0 / q_0^T q_0) δ_0,   q_1 = H y_1 + δ_1 c,

and, for k = 1, ..., r-1,

y_{k+1} = -α_k ( q_k - (q_k^T H q_k / q_k^T q_k) y_k - (q_{k-1}^T H q_k / q_{k-1}^T q_{k-1}) y_{k-1} ),
δ_{k+1} = α_k ( (q_k^T H q_k / q_k^T q_k) δ_k + (q_{k-1}^T H q_k / q_{k-1}^T q_{k-1}) δ_{k-1} ),
q_{k+1} = H y_{k+1} + δ_{k+1} c,

where α_k, k = 0, ..., r-1, are free and nonzero parameters. In addition, the vectors y_k, k = 1, ..., r, are linearly independent.

Proof. For k = 0, q_0 = c = H y_0 + δ_0 c holds by assumption, since y_0 = 0 and δ_0 = 1. For k = 1, rewriting the expression for q_1 from Proposition 2.1 yields

q_1 = -α_0 ( H q_0 - (q_0^T H q_0/q_0^T q_0) q_0 ) = -α_0 ( H q_0 - (q_0^T H q_0/q_0^T q_0)(H y_0 + δ_0 c) )
    = H ( -α_0 ( q_0 - (q_0^T H q_0/q_0^T q_0) y_0 ) ) + α_0 (q_0^T H q_0/q_0^T q_0) δ_0 c = H y_1 + δ_1 c,

where the last equality follows from the expressions for y_1 and δ_1. Similarly, for k = 1, ..., r-1, assume that q_k = H y_k + δ_k c and q_{k-1} = H y_{k-1} + δ_{k-1} c. Rewriting the expression for q_{k+1} from Proposition 2.1 yields

q_{k+1} = -α_k ( H q_k - (q_k^T H q_k/q_k^T q_k) q_k - (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1}) q_{k-1} )
        = -α_k ( H q_k - (q_k^T H q_k/q_k^T q_k)(H y_k + δ_k c) - (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1})(H y_{k-1} + δ_{k-1} c) )
        = H ( -α_k ( q_k - (q_k^T H q_k/q_k^T q_k) y_k - (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1}) y_{k-1} ) )
          + α_k ( (q_k^T H q_k/q_k^T q_k) δ_k + (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1}) δ_{k-1} ) c = H y_{k+1} + δ_{k+1} c,

where the last equality follows from the expressions for y_{k+1} and δ_{k+1}. By Lemma A.1, for k = 0, ..., r the parameters δ_k^{(j)}, j = 0, ..., k, are uniquely determined up to a scaling; hence y_{k+1} and δ_{k+1}, k = 0, ..., r-1, are uniquely determined up to a scaling by the recursions of this proposition.

Further, note that the recursion for y_{k+1} has a nonzero leading term in q_k plus additional terms in y_i for i = k and i = k-1. Since q_k is orthogonal to y_i for i ≤ k, and q_k ≠ 0 for k < r, it follows that the vectors y_{k+1}, k = 0, ..., r-1, are linearly independent.
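As a sanity check of Proposition 3.1, the sketch below generates the triples (q_k, y_k, δ_k) with α_k = 1 and then applies the identity x_r = (1/δ_r) y_r from Theorem 3.1(a) below. It is a hedged illustration under the assumption of a well-conditioned compatible test problem; the function name and data are our own, not the paper's implementation.

```python
import numpy as np

def triples(H, c, alpha=lambda k: 1.0, tol=1e-12):
    """Triples (q_k, y_k, delta_k) of Proposition 3.1, with q_k = H y_k + delta_k c.
    Returns lists Q, Y, D whose last entries correspond to k = r (where q_r ~ 0)."""
    n = c.size
    q, y, d = c.astype(float), np.zeros(n), 1.0        # (q_0, y_0, delta_0) = (c, 0, 1)
    qm = ym = dm = None                                # the (k-1)-st triple
    Q, Y, D = [q], [y], [d]
    for k in range(n):
        Hq = H @ q
        beta = (q @ Hq) / (q @ q)
        y_new, d_new = q - beta * y, beta * d
        if qm is not None:
            gamma = (qm @ Hq) / (qm @ qm)
            y_new, d_new = y_new - gamma * ym, d_new + gamma * dm
        a = alpha(k)
        y_new, d_new = -a * y_new, a * d_new           # y_{k+1}, delta_{k+1}
        q_new = H @ y_new + d_new * c                  # q_{k+1}
        Q.append(q_new); Y.append(y_new); D.append(d_new)
        if np.linalg.norm(q_new) <= tol:               # q_r = 0: stop
            break
        qm, ym, dm = q, y, d
        q, y, d = q_new, y_new, d_new
    return Q, Y, D

H = np.diag([2.0, -1.0, 3.0]); c = np.array([1.0, 2.0, 3.0])
Q, Y, D = triples(H, c)
x = Y[-1] / D[-1]                                      # Theorem 3.1(a): x_r = (1/delta_r) y_r
print(np.linalg.norm(H @ x + c))                       # ~0: the system is compatible
```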

We will henceforth refer to the triples (q_k, y_k, δ_k), k = 0, ..., r, as given by Proposition 3.1. In particular, the triple (q_r, y_r, δ_r) can now be used to show our main convergence result, namely that (1.1) has a solution if and only if δ_r ≠ 0, and that the recursions in Proposition 3.1 can be used to find a solution if δ_r ≠ 0 and a certificate of incompatibility if δ_r = 0.

Theorem 3.1. Let (q_k, y_k, δ_k), k = 0, ..., r, be given by Proposition 3.1, and let Z denote a matrix whose columns form an orthonormal basis for N(H). Then the following holds for the cases δ_r ≠ 0 and δ_r = 0, respectively.

a) If δ_r ≠ 0, then H x_r + c = 0 for x_r = (1/δ_r) y_r, so that c is in R(H) and x_r solves (1.1). In addition, it holds that Z^T y_k = 0, k = 0, ..., r.

b) If δ_r = 0, then y_r = δ_r^{(1)} Z Z^T c, with δ_r^{(1)} ≠ 0 and Z^T c ≠ 0, so that c is not in R(H) and (1.1) has no solution. Further, there is a y_r^{(1)} in K_{r-1}(c, H) such that y_r = H y_r^{(1)} + δ_r^{(1)} c. Hence, H(H x_r^{(1)} + c) = 0 for x_r^{(1)} = (1/δ_r^{(1)}) y_r^{(1)}, so that x_r^{(1)} solves min_{x in R^n} ||Hx + c||_2^2.

If H is nonsingular, then Z is to be interpreted as an empty matrix.

Proof. For a), suppose that δ_r ≠ 0. Then 0 = q_r = H y_r + δ_r c, hence H x_r + c = 0 for x_r = (1/δ_r) y_r, i.e., x_r is a solution of (1.1). Since (1.1) has a solution, it must hold that Z^T c = 0. We have y_k = H y_k^{(1)} + δ_k^{(1)} c for k = 0, ..., r, so that Z^T y_k = δ_k^{(1)} Z^T c. As Z^T c = 0, it follows that Z^T y_k = 0, k = 0, ..., r.

For b), suppose that δ_r = 0. We have y_r = H y_r^{(1)} + δ_r^{(1)} c, so that Z^T y_r = δ_r^{(1)} Z^T c. If δ_r = 0, then 0 = q_r = H y_r, so that y_r lies in N(H) and y_r = Z Z^T y_r = δ_r^{(1)} Z Z^T c. It follows from Proposition 3.1 that y_r ≠ 0, so that δ_r^{(1)} ≠ 0 and Z^T c ≠ 0. A combination of H y_r = 0 and y_r = H y_r^{(1)} + δ_r^{(1)} c gives H(H y_r^{(1)} + δ_r^{(1)} c) = 0. Consequently, since δ_r^{(1)} ≠ 0, it holds that H(H x_r^{(1)} + c) = 0 for x_r^{(1)} = (1/δ_r^{(1)}) y_r^{(1)}. With f(x) = (1/2)||Hx + c||_2^2, one obtains ∇f(x) = H(Hx + c), so that x_r^{(1)} is a global minimizer of f over R^n.

Hence, we have shown that (1.1) has a solution if and only if δ_r ≠ 0, and that the recursions in Proposition 3.1 can be used to find a solution if δ_r ≠ 0 and a certificate of incompatibility if δ_r = 0.

Next we make a few remarks on the behavior of the third component of the triples defined by Proposition 3.1, i.e., the sequence {δ_k}, and on the choice of sign for the nonzero scaling factors {α_k}. In the following proposition it is shown that the sequence {δ_k} cannot have two zero elements in a row.

Proposition 3.2. Let (q_k, y_k, δ_k), k = 0, ..., r, be given by Proposition 3.1. If q_k ≠ 0 and δ_k = 0, then

δ_{k+1} = -(α_k/α_{k-1}) (q_k^T q_k / q_{k-1}^T q_{k-1}) δ_{k-1} ≠ 0.

Proof. By Proposition 2.1 it holds that q_k^T q_k = -α_{k-1} q_k^T H q_{k-1}, and, taking into account δ_k = 0, the expression for δ_{k+1} from Proposition 3.1 gives

δ_{k+1} = α_k (q_{k-1}^T H q_k / q_{k-1}^T q_{k-1}) δ_{k-1} = -(α_k/α_{k-1}) (q_k^T q_k / q_{k-1}^T q_{k-1}) δ_{k-1},   (3.3)

giving the required expression for δ_{k+1}. It remains to show that δ_{k+1} ≠ 0. First, assume that k = 1, so that δ_1 = 0. Then, since α_1 ≠ 0, α_0 ≠ 0 and δ_0 = 1, (3.3) gives δ_2 ≠ 0. Now assume that k > 1 and, to get a contradiction, that δ_{k+1} = 0. Then, since α_k ≠ 0 and α_{k-1} ≠ 0, (3.3) gives δ_{k-1} = 0. We may then repeat the same argument to obtain δ_{k-i} = 0, i = 1, ..., k. But this gives a contradiction, as δ_1 = 0 implies δ_2 ≠ 0. Hence, it must hold that δ_{k+1} ≠ 0, as required.

Based on Proposition 3.2, the following corollary states that if α_{k-1} and α_k have the same sign and δ_k = 0, then δ_{k+1} and δ_{k-1} have opposite signs.

Corollary 3.1. Let (q_k, y_k, δ_k), k = 0, ..., r, be given by Proposition 3.1, with α_{k-1} and α_k of the same sign. If q_k ≠ 0 and δ_k = 0, then δ_{k+1} δ_{k-1} < 0.

The following lemma states an expression for the triples that will be useful in showing properties of the signs of δ_k and α_k for the case when H is positive semidefinite.

Lemma 3.1. Let (q_k, y_k, δ_k), k = 0, ..., r, be given by Proposition 3.1. If δ_k ≠ 0 and k < r, then

( y_{k+1} - (δ_{k+1}/δ_k) y_k )^T H ( y_{k+1} - (δ_{k+1}/δ_k) y_k ) = α_k (δ_{k+1}/δ_k) q_k^T q_k.   (3.4)

Proof. Eliminating c from the difference of q_{k+1} and q_k yields

q_{k+1} - (δ_{k+1}/δ_k) q_k = H ( y_{k+1} - (δ_{k+1}/δ_k) y_k ).   (3.5)

Pre-multiplication of (3.5) by (y_{k+1} - (δ_{k+1}/δ_k) y_k)^T yields

( y_{k+1} - (δ_{k+1}/δ_k) y_k )^T ( q_{k+1} - (δ_{k+1}/δ_k) q_k ) = ( y_{k+1} - (δ_{k+1}/δ_k) y_k )^T H ( y_{k+1} - (δ_{k+1}/δ_k) y_k ).   (3.6)

Since q_{k+1} is orthogonal to y_{k+1} and y_k, and since q_k is orthogonal to y_k, (3.6) becomes

-(δ_{k+1}/δ_k) y_{k+1}^T q_k = ( y_{k+1} - (δ_{k+1}/δ_k) y_k )^T H ( y_{k+1} - (δ_{k+1}/δ_k) y_k ).   (3.7)

Hence, by the recursion for y_{k+1} in Proposition 3.1, and since q_k is orthogonal to y_k and y_{k-1}, (3.7) may be written as

α_k (δ_{k+1}/δ_k) q_k^T q_k = ( y_{k+1} - (δ_{k+1}/δ_k) y_k )^T H ( y_{k+1} - (δ_{k+1}/δ_k) y_k ),

hence (3.4) is obtained.

The following lemma gives some results on the behavior of the sequence {δ_k} in connection with the sign of α_k for the case when H ⪰ 0.

Lemma 3.2. Let (q_k, y_k, δ_k), k = 0, ..., r, be given by Proposition 3.1. Assume that H ⪰ 0. Then δ_k ≠ 0 for k < r. If δ_k > 0 and δ_{k+1} ≠ 0, then δ_{k+1} > 0 if and only if α_k > 0.

Proof. Assume that δ_k = 0 for some k < r. Then q_k = H y_k, and pre-multiplication by y_k^T yields 0 = y_k^T q_k = y_k^T H y_k, since q_k is orthogonal to y_k. Then, since H ⪰ 0, it follows that H y_k = 0 and hence q_k = 0. Since q_k ≠ 0 for k < r, this is a contradiction; hence δ_k ≠ 0 for k < r. Next suppose that δ_k > 0 and δ_{k+1} ≠ 0. Since H ⪰ 0, Lemma 3.1 gives

α_k (δ_{k+1}/δ_k) q_k^T q_k ≥ 0,   (3.8)

which implies that δ_{k+1} and δ_k have the same sign if and only if α_k > 0. Hence, if δ_k > 0, then δ_{k+1} > 0 if and only if α_k > 0.

The result of Lemma 3.2 motivates our choice of the minus sign in (2.4): otherwise δ_k would alternate in sign at each iteration for α_k > 0 and H ⪰ 0. Note that for α_k > 0 and H ⪰ 0, if δ_0 > 0 then δ_i > 0, i = 0, ..., r-1, and δ_r ≥ 0. Further, if H ≻ 0 then the inequality in (3.8) is strict, implying that δ_i > 0, i = 1, ..., r, i.e., δ_r ≠ 0, since (1.1) with H ≻ 0 is always compatible. Another consequence is that if α_k is chosen positive for k = 0, ..., r, then δ_k = 0 for some k implies that H is not positive definite, and δ_k < 0 for some k implies that H is not positive semidefinite.

3.3. A general Krylov algorithm

Based on our results, we now state an algorithm for solving (1.1) based on the triples (q_k, y_k, δ_k), k = 0, ..., r, as in Proposition 3.1, using some α_k of our choice. We call Algorithm 3.1 a Krylov algorithm, as it is the vectors {q_k}, spanning the Krylov subspaces, that drive the progress of the algorithm and ensure its termination. The choice of a nonzero α_k is in theory arbitrary, but for the algorithm stated below we have chosen α_k > 0 such that ||y_{k+1}||_2 = ||c||_2, i.e.,

α_0 = ||c||_2 / || q_0 - (q_0^T H q_0/q_0^T q_0) y_0 ||_2,
α_k = ||c||_2 / || q_k - (q_k^T H q_k/q_k^T q_k) y_k - (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1}) y_{k-1} ||_2,   k = 1, ..., r.

This choice is well defined since y_k ≠ 0, k = 1, ..., r, by Proposition 3.1. In theory, triples are generated as long as q_k ≠ 0. In the algorithm, we introduce a tolerance such that we proceed with the iterations as long as ||q_k||_2 > q_tol, where we let q_tol = ε_M, with ε_M the machine precision. In theory we also draw conclusions based on δ_r ≠ 0 or δ_r = 0; for this we introduce a tolerance δ_tol = ε_M.

Algorithm 3.1 A general Krylov algorithm
Input arguments: H, c.
Output arguments: compatible; x_r if compatible = 1; y_r if compatible = 0.
q_tol: tolerance on ||q_k||_2   [our choice: q_tol = ε_M]
δ_tol: tolerance on δ_r   [our choice: δ_tol = ε_M]

k ← 0;  q_0 ← c;  y_0 ← 0;  δ_0 ← 1;
y_1 ← -( q_0 - (q_0^T H q_0/q_0^T q_0) y_0 );  δ_1 ← (q_0^T H q_0/q_0^T q_0) δ_0;  q_1 ← H y_1 + δ_1 c;
α_0 ← nonzero scalar;   [our choice: α_0 = ||c||_2 / ||y_1||_2]
q_1 ← α_0 q_1;  y_1 ← α_0 y_1;  δ_1 ← α_0 δ_1;
k ← 1;
while ||q_k||_2 > q_tol do
    y_{k+1} ← -( q_k - (q_k^T H q_k/q_k^T q_k) y_k - (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1}) y_{k-1} );
    δ_{k+1} ← (q_k^T H q_k/q_k^T q_k) δ_k + (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1}) δ_{k-1};
    q_{k+1} ← H y_{k+1} + δ_{k+1} c;
    α_k ← nonzero scalar;   [our choice: α_k = ||c||_2 / ||y_{k+1}||_2]
    q_{k+1} ← α_k q_{k+1};  y_{k+1} ← α_k y_{k+1};  δ_{k+1} ← α_k δ_{k+1};
    k ← k + 1;
end while
r ← k;
if |δ_r| > δ_tol then
    x_r ← (1/δ_r) y_r;  compatible ← 1;
else
    compatible ← 0;
end if

By Theorem 3.1, Algorithm 3.1 returns either a solution of (1.1) or a certificate that the system is incompatible. Note that δ_k = 0 for some k does not stop Algorithm 3.1.
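The following is a compact numpy sketch of Algorithm 3.1. The scaling α_k = ||c||_2/||y_{k+1}||_2 follows the paper's suggestion, while the square-root-of-machine-precision tolerances, the function name and the return convention (a flag plus either x_r or the certificate y_r) are our own pragmatic choices.

```python
import numpy as np

def general_krylov(H, c, q_tol=None, d_tol=None):
    """Sketch of Algorithm 3.1.  Returns (True, x_r) with H x_r + c = 0 if the system is
    compatible, and (False, y_r) with H y_r = 0 and c^T y_r != 0 otherwise."""
    eps = np.finfo(float).eps
    q_tol = np.sqrt(eps) if q_tol is None else q_tol
    d_tol = np.sqrt(eps) if d_tol is None else d_tol
    n, nc = c.size, np.linalg.norm(c)
    q, y, d = c.astype(float), np.zeros(n), 1.0
    qm = ym = dm = None
    for k in range(n):
        Hq = H @ q
        beta = (q @ Hq) / (q @ q)
        y_new, d_new = q - beta * y, beta * d
        if qm is not None:
            gamma = (qm @ Hq) / (qm @ qm)
            y_new, d_new = y_new - gamma * ym, d_new + gamma * dm
        a = nc / np.linalg.norm(y_new)        # alpha_k > 0 so that ||y_{k+1}|| = ||c||
        y_new, d_new = -a * y_new, a * d_new
        q_new = H @ y_new + d_new * c
        qm, ym, dm = q, y, d
        q, y, d = q_new, y_new, d_new
        if np.linalg.norm(q) <= q_tol:        # q_r ~ 0: terminate
            break
    if abs(d) > d_tol:
        return True, y / d                    # x_r = (1/delta_r) y_r
    return False, y                           # y_r: certificate of incompatibility

H = np.array([[2.0, 1.0], [1.0, -3.0]]); c = np.array([1.0, 2.0])   # symmetric, indefinite
ok, x = general_krylov(H, c)
print(ok, np.linalg.norm(H @ x + c))          # True, ~0
print(general_krylov(np.diag([1.0, 0.0]), np.array([1.0, 1.0])))    # False, y_r in N(H)
```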

The following example illustrates Algorithm 3.1, with our choices for α_k > 0, q_tol and δ_tol, on an instance of (1.1) where δ_k = 0 at every other iteration. The example also illustrates the change of sign between δ_{k+1} and δ_{k-1} when δ_k = 0, as predicted by Corollary 3.1.

Example 3.1. Let H = diag(c) for a suitable vector c. Algorithm 3.1 applied to H and c, with α_k > 0 such that ||y_k||_2 = ||c||_2, k = 1, ..., r, and q_tol = δ_tol = ε_M, yields sequences {q_k}, {y_k} and {δ_k} in which δ_k = 0 at every other iteration, with the predicted sign changes. [The numerical data of this example is omitted here.] Hence, r = 6 and x_r = (1/δ_r) y_r.

4. Connection to the method of conjugate gradients

Next we put the method of conjugate gradients of Hestenes and Stiefel [9] into the framework of our results. For an introduction to the method of conjugate gradients, see, e.g., [12, 19]. This method is defined for the case H ≻ 0. The iterate x_k is then defined as the solution of

min_{x in K_k(c, H)} (1/2) x^T H x + c^T x,

and g_k = H x_k + c for k = 0, ..., r, i.e., g_k in K_{k+1}(c, H) ∩ K_k(c, H)^⊥. In our setting, this is equivalent to generating triples (q_k, y_k, δ_k), k = 0, ..., r, given by Proposition 3.1, selecting the scaling α_k such that δ_k = 1. If it is assumed that H ⪰ 0, Lemma 3.2 gives δ_k ≠ 0, k = 0, ..., r-1. If c is in R(H), i.e., (1.1) is compatible, then Theorem 3.1 ensures that δ_r ≠ 0. Hence, if H ⪰ 0 and c is in R(H), such a scaling is well defined. The following proposition gives the particular choices of scaling α_k for which (q_k, y_k, δ_k) may be denoted by (g_k, x_k, 1), k = 0, ..., r.

Proposition 4.1. Assume that H ⪰ 0 and c is in R(H). If (q_k, y_k, δ_k) are given by Proposition 3.1 for the particular choices

α_0 = (1/δ_0) (q_0^T q_0 / q_0^T H q_0),
α_k = 1 / ( (q_k^T H q_k/q_k^T q_k) δ_k + (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1}) δ_{k-1} ),   k = 1, ..., r-1,   (4.1)

then α_k > 0, k = 0, ..., r-1, and δ_k = 1, k = 0, ..., r. Hence, (q_k, y_k, δ_k) may be denoted by (g_k, x_k, 1), k = 0, ..., r, and it holds that

α_0 = g_0^T g_0 / g_0^T H g_0,   x_1 = x_0 + α_0 (-g_0),   g_1 = H x_1 + c,   (4.2a)
α_k = 1 / ( g_k^T H g_k/g_k^T g_k + g_{k-1}^T H g_k/g_{k-1}^T g_{k-1} ),   k = 1, ..., r-1,   (4.2b)
x_{k+1} = x_k + α_k ( -g_k - (g_{k-1}^T H g_k/g_{k-1}^T g_{k-1}) (x_k - x_{k-1}) ),   k = 1, ..., r-1,   (4.2c)
g_{k+1} = H x_{k+1} + c,   k = 1, ..., r-1.   (4.2d)

Proof. By assumption δ_0 = 1, and by inserting α_k, k = 0, ..., r-1, as in (4.1) into the expressions for δ_{k+1} from Proposition 3.1, it holds that δ_k = 1, k = 0, ..., r. Hence, (q_k, y_k, δ_k) may be denoted by (g_k, x_k, 1), k = 0, ..., r. If (g_k, x_k, 1) is introduced for (q_k, y_k, δ_k) in (4.1), then the expressions for α_k in (4.2a) and (4.2b) are obtained. By these expressions, the iterates x_k, k = 0, ..., r, are given by x_0 = 0,

x_1 = -α_0 ( g_0 - (g_0^T H g_0/g_0^T g_0) x_0 ) = x_0 + α_0 (-g_0),
x_{k+1} = -α_k ( g_k - (g_k^T H g_k/g_k^T g_k) x_k - (g_{k-1}^T H g_k/g_{k-1}^T g_{k-1}) x_{k-1} )
        = x_k + α_k ( -g_k - (g_{k-1}^T H g_k/g_{k-1}^T g_{k-1}) (x_k - x_{k-1}) ),   k = 1, ..., r-1,

and g_k, k = 0, ..., r, are given by g_0 = c and g_{k+1} = H x_{k+1} + c, k = 0, ..., r-1. By Lemma 3.2, and since δ_k = 1 > 0, k = 0, ..., r, it holds that α_k > 0, k = 0, ..., r-1.

It is evident that the method of conjugate gradients relies heavily on the fact that (g_k, x_k, 1) is well defined for all k = 0, ..., r, i.e., that δ_k ≠ 0 for all k. In effect, the method of conjugate gradients implicitly makes the very specific choice of α_k, k = 0, ..., r-1, given in Proposition 4.1. If the assumption H ⪰ 0 and c in R(H) is lifted, then a high price is paid for using (g_k, x_k, 1): the algorithm breaks down if δ_k = 0 for some k. By our analysis, this choice of scaling is unnecessary, as illustrated by Theorem 3.1 and Algorithm 3.1. In particular, if the method of conjugate gradients is applied to a case where H ⪰ 0 and c is not in R(H), the algorithm will break down; from Lemma 3.1 it is clear that this happens at the very last iteration. Algorithm 3.1 would instead provide y_r as a certificate of incompatibility in this situation.

The method of conjugate gradients may be regarded as a line-search method on an unconstrained quadratic program, see, e.g., [19], where search directions p_k and optimal step lengths along these are calculated along with the iterates x_k and residuals g_k, k = 0, ..., r.

In the following corollary we show that this formulation can be obtained from Proposition 4.1.

Corollary 4.1. Assume that H ⪰ 0 and c is in R(H). Let (g_k, x_k, 1), k = 0, ..., r, be given by Proposition 4.1. Then

x_{k+1} = x_k + α_k p_k,   g_{k+1} = g_k + α_k H p_k,   k = 0, ..., r-1,

where

p_0 = -g_0,   p_k = -g_k + (g_k^T g_k / g_{k-1}^T g_{k-1}) p_{k-1},   k = 1, ..., r-1,
α_k = -g_k^T p_k / (p_k^T H p_k),   k = 0, ..., r-1.

Proof. Let p_k = (1/α_k)(x_{k+1} - x_k), k = 0, ..., r-1. Proposition 4.1 then gives p_0 = -g_0 and

p_k = -g_k - (g_{k-1}^T H g_k/g_{k-1}^T g_{k-1}) (x_k - x_{k-1}),   k = 1, ..., r-1.

By Proposition 2.1 it holds that g_k^T g_k = -α_{k-1} g_k^T H g_{k-1}; hence

p_k = -g_k - (g_{k-1}^T H g_k/g_{k-1}^T g_{k-1}) α_{k-1} p_{k-1} = -g_k + (g_k^T g_k/g_{k-1}^T g_{k-1}) p_{k-1},   k = 1, ..., r-1.

Consequently, since g_{k+1} - g_k = H(x_{k+1} - x_k) = α_k H p_k, it holds that g_{k+1} = g_k + α_k H p_k, k = 0, ..., r-1. Further, g_{k+1}^T p_k = 0, since g_{k+1} is in K_{k+2}(c, H) ∩ K_{k+1}(c, H)^⊥ and p_k is in K_{k+1}(c, H). Hence, 0 = g_{k+1}^T p_k = g_k^T p_k + α_k p_k^T H p_k yields

α_k = -g_k^T p_k / (p_k^T H p_k),   k = 0, ..., r-1,

completing the proof.

Note that the scaling of the triple given in Proposition 4.1 is, in the line-search formulation of the method of conjugate gradients, regarded as the scaling of a search direction. From the point of view of our results it is not necessary to calculate search directions; Proposition 4.1 gives a complete description of the method of conjugate gradients.
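For reference, the line-search form of Corollary 4.1 is the familiar conjugate gradient iteration. The sketch below assumes, as the corollary does, that H ⪰ 0 with c in R(H) (in the demo H ≻ 0); the function name and test data are our own.

```python
import numpy as np

def conjugate_gradients(H, c, tol=1e-12):
    """CG in the line-search form of Corollary 4.1: minimizes 1/2 x^T H x + c^T x,
    i.e. solves H x + c = 0, for H positive definite (or semidefinite with c in range(H))."""
    x = np.zeros_like(c, dtype=float)
    g = c.astype(float)                  # g_0 = H x_0 + c = c
    p = -g                               # p_0 = -g_0
    for _ in range(c.size):
        if np.linalg.norm(g) <= tol:
            break
        Hp = H @ p
        a = -(g @ p) / (p @ Hp)          # alpha_k = -g_k^T p_k / (p_k^T H p_k)
        x = x + a * p
        g_new = g + a * Hp               # g_{k+1} = g_k + alpha_k H p_k
        b = (g_new @ g_new) / (g @ g)    # g_{k+1}^T g_{k+1} / (g_k^T g_k)
        p = -g_new + b * p               # p_{k+1} = -g_{k+1} + b p_k
        g = g_new
    return x

H = np.diag([4.0, 2.0, 1.0]); c = np.array([1.0, 1.0, 1.0])
x = conjugate_gradients(H, c)
print(np.linalg.norm(H @ x + c))         # ~0
```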

5. Connection to the minimum-residual method

One issue that remains to be addressed is the case when (1.1) is incompatible. Algorithm 3.1 then gives a certificate of this fact, but often one is interested in a vector x that is as good as possible. The method of choice could then be the minimum-residual method. This method goes back to [11] and [20], and the name is adopted from the implementation of the method, MINRES, by Paige and Saunders; see [16]. For k = 0, ..., r, the iterate x_k^MR is defined as a solution of

min_{x in K_k(c, H)} ||Hx + c||_2^2,

and the corresponding residual is g_k^MR = H x_k^MR + c. The vectors x_k^MR are uniquely defined for k = 0, ..., r-1, and for k = r if c is in R(H). In the case k = r and c not in R(H) there is one degree of freedom in x_r^MR. If c is in R(H), then x_r^MR solves (1.1); if c is not in R(H), then x_{r-1}^MR and x_r^MR are both solutions of min_{x in R^n} ||Hx + c||_2^2.

In the following theorem we derive the minimum-residual method based on our framework. In particular, we give explicit formulas for x_k^MR and g_k^MR, k = 0, ..., r. For the case k = r, c not in R(H), we give an explicit formula for the x_r^MR of minimum Euclidean norm.

Theorem 5.1. Let (q_k, y_k, δ_k) be given by Proposition 3.1 for k = 0, ..., r. Then, for k = 0, ..., r, x_k^MR solves min_{x in K_k(c, H)} ||Hx + c||_2^2 if and only if x_k^MR = Σ_{i=0}^{k} γ_i y_i for some γ_i, i = 0, ..., k, that are optimal to

minimize_{γ_0, ..., γ_k} (1/2) Σ_{i=0}^{k} γ_i^2 q_i^T q_i   subject to   Σ_{i=0}^{k} γ_i δ_i = 1.   (5.1)

In particular, x_k^MR takes the following form for the mutually exclusive cases a) k < r; b) k = r and δ_r ≠ 0; and c) k = r and δ_r = 0.

a) For k < r, it holds that

x_k^MR = ( 1 / Σ_{j=0}^{k} δ_j^2/(q_j^T q_j) ) Σ_{i=0}^{k} ( δ_i/(q_i^T q_i) ) y_i,   (5.2)

and g_k^MR = H x_k^MR + c.

b) For k = r and δ_r ≠ 0, it holds that x_r^MR = (1/δ_r) y_r and g_r^MR = H x_r^MR + c = 0, so that x_r^MR solves (1.1) and x_r^MR is identical to x_r of Theorem 3.1.

c) For k = r and δ_r = 0, it holds that x_r^MR = x_{r-1}^MR + γ_r y_r, where γ_r is an arbitrary scalar, and g_r^MR = H x_r^MR + c = g_{r-1}^MR. In addition, x_{r-1}^MR and x_r^MR solve min_{x in R^n} ||Hx + c||_2^2. The particular choice

γ_r = -(y_r^T x_{r-1}^MR) / (y_r^T y_r)

makes x_r^MR an optimal solution to min_{x in R^n} ||Hx + c||_2^2 of minimum Euclidean norm.

Proof. Since q_i, i = 0, ..., k, form an orthogonal basis for K_{k+1}(c, H), an arbitrary vector in K_{k+1}(c, H) can be written as

g = Σ_{i=0}^{k} γ_i q_i = Σ_{i=0}^{k} γ_i (H y_i + δ_i c) = H ( Σ_{i=0}^{k} γ_i y_i ) + ( Σ_{i=0}^{k} γ_i δ_i ) c,   (5.3)

for some parameters γ_i, i = 0, ..., k. Consequently, the condition Σ_{i=0}^{k} γ_i δ_i = 1 inserted into (5.3) gives g = Hx + c with x = Σ_{i=0}^{k} γ_i y_i, i.e., g is an arbitrary vector in K_{k+1}(c, H) for which the coefficient in front of c equals one, and x is the corresponding arbitrary vector in K_k(c, H). Minimizing the Euclidean norm of such a g gives the quadratic program

minimize_{g, γ_0, ..., γ_k} (1/2) g^T g   subject to   g = Σ_{i=0}^{k} γ_i q_i,   Σ_{i=0}^{k} γ_i δ_i = 1,   (5.4)

so that, by (5.3), the optimal values of γ_i, i = 0, ..., k, give g_k^MR = Σ_{i=0}^{k} γ_i q_i and x_k^MR = Σ_{i=0}^{k} γ_i y_i. Elimination of g in (5.4), taking into account the orthogonality of the q_i's, gives the equivalent problem (5.1). Also note that, since δ_0 ≠ 0, the quadratic programs (5.1) and (5.4) are always feasible, and hence well defined.

Let L(γ, λ) be the Lagrangian function for (5.1),

L(γ, λ) = (1/2) Σ_{i=0}^{k} γ_i^2 q_i^T q_i - λ ( Σ_{i=0}^{k} γ_i δ_i - 1 ).

The optimality conditions for (5.1) are given by

0 = ∂L/∂γ_i = γ_i q_i^T q_i - λ δ_i,   i = 0, ..., k,   (5.5a)
0 = ∂L/∂λ = 1 - Σ_{i=0}^{k} γ_i δ_i.   (5.5b)

First, for a), consider the case k < r. From (5.5a) it holds that

γ_i = λ δ_i / (q_i^T q_i),   i = 0, ..., k,   (5.6)

which is well defined, since q_i ≠ 0, i = 0, ..., r-1. The expression for λ is obtained by inserting (5.6) into (5.5b), so that

1 = Σ_{i=0}^{k} γ_i δ_i = λ Σ_{i=0}^{k} δ_i^2/(q_i^T q_i),   yielding   λ = 1 / Σ_{j=0}^{k} δ_j^2/(q_j^T q_j).   (5.7)

Consequently, a combination of (5.6) and (5.7) gives

γ_i = ( 1 / Σ_{j=0}^{k} δ_j^2/(q_j^T q_j) ) ( δ_i / (q_i^T q_i) ),   i = 0, ..., k.   (5.8)

Hence, by letting x_k^MR = Σ_{i=0}^{k} γ_i y_i, with γ_i given by (5.8), (5.2) follows.

Now, for b), consider the case k = r with δ_r ≠ 0. Then, since q_r = 0, (5.5a) gives λ = 0 and γ_i = 0, i = 0, ..., r-1. Consequently, (5.5b) gives γ_r = 1/δ_r. Again, by letting x_r^MR = Σ_{i=0}^{r} γ_i y_i, it holds that x_r^MR = (1/δ_r) y_r, for which g_r^MR = H x_r^MR + c = 0, so that the optimal value in (5.1) is zero and x_r^MR solves (1.1).

Finally, for c), consider the case k = r with δ_r = 0. Theorem 3.1 shows that there exists an x_r^{(1)} in K_{r-1}(c, H) that solves min_{x in R^n} ||Hx + c||_2^2. Consequently, since x_r^{(1)} is in K_{r-1}(c, H), it follows from a) that x_{r-1}^MR solves min_{x in R^n} ||Hx + c||_2^2. For k = r, the optimality conditions (5.5) are equivalent to those for k = r-1, with the additional information that γ_r is arbitrary. Hence, x_r^MR = x_{r-1}^MR + γ_r y_r and g_r^MR = H x_r^MR + c = g_{r-1}^MR for arbitrary γ_r, since H y_r = 0.

For the remainder of the proof, let x_r^MR = x_{r-1}^MR + γ_r y_r for the particular choice γ_r = -(y_r^T x_{r-1}^MR)/(y_r^T y_r), so that y_r^T x_r^MR = 0. Since y_k = H y_k^{(1)} + δ_k^{(1)} c, it follows that Z^T y_k = δ_k^{(1)} Z^T c, k = 0, ..., r. Consequently, since x_r^MR is formed as a linear combination of the y_k, k = 0, ..., r, it holds that Z^T x_r^MR is parallel to Z^T c. But y_r = δ_r^{(1)} Z Z^T c, so that 0 = y_r^T x_r^MR = δ_r^{(1)} c^T Z Z^T x_r^MR, and hence Z^T x_r^MR is also orthogonal to Z^T c. By Theorem 3.1, Z^T c ≠ 0, so that Z^T x_r^MR = 0. Hence, x_r^MR belongs to the range space of H for this choice of γ_r. As x_r^MR = x_{r-1}^MR + γ_r y_r and y_r belongs to the null space of H, the range-space component of x_r^MR is unique. Hence, since the range-space component is unique and the null-space component is zero, the conclusion is that x_r^MR is an optimal solution to min_{x in R^n} ||Hx + c||_2^2 of minimum Euclidean norm.

Note that at an iteration at which δ_k = 0 and q_k ≠ 0, it holds that x_k^MR = x_{k-1}^MR, so that the iterate is unchanged. By Proposition 3.2 this cannot happen at two consecutive iterations. If q_k ≠ 0, all information from the problem has not yet been extracted, even if δ_k = 0. Only in the case k = r is global information obtained, and it is then determined whether (1.1) has a solution.

In the following corollary of Theorem 5.1 we state explicit recursions for the minimum-residual method.

Corollary 5.1. Let (q_k, y_k, δ_k) be given by Proposition 3.1, and let x_k^MR be given by Theorem 5.1, for k = 0, ..., r. If

δ_0^MR = δ_0^2,   y_0^MR = δ_0 y_0,
δ_k^MR = q_k^T q_k Σ_{i=0}^{k-1} δ_i^2/(q_i^T q_i) + δ_k^2   and   y_k^MR = q_k^T q_k Σ_{i=0}^{k-1} ( δ_i/(q_i^T q_i) ) y_i + δ_k y_k,   k = 1, ..., r,

then

x_k^MR = (1/δ_k^MR) y_k^MR   for k = 0, ..., r-1, and for k = r if δ_r ≠ 0.

In addition, it holds that

δ_{k+1}^MR = (q_{k+1}^T q_{k+1} / q_k^T q_k) δ_k^MR + δ_{k+1}^2   and   y_{k+1}^MR = (q_{k+1}^T q_{k+1} / q_k^T q_k) y_k^MR + δ_{k+1} y_{k+1},

for k = 0, ..., r-1.

Proof. For k = 0, ..., r-1, the expressions for δ_k^MR and y_k^MR give

(1/(q_k^T q_k)) δ_k^MR = Σ_{i=0}^{k} δ_i^2/(q_i^T q_i)   and   (1/(q_k^T q_k)) y_k^MR = Σ_{i=0}^{k} ( δ_i/(q_i^T q_i) ) y_i.

Note that δ_0 ≠ 0 and q_i ≠ 0, i = 0, ..., k, imply δ_k^MR > 0 for k < r. Hence, Theorem 5.1 gives x_k^MR = (1/δ_k^MR) y_k^MR. If k = r and δ_r ≠ 0, the expressions for δ_r^MR and y_r^MR give δ_r^MR = δ_r^2 > 0 and y_r^MR = δ_r y_r, so that Theorem 5.1 gives x_r^MR = (1/δ_r^MR) y_r^MR also in this case. To see the recursions, note that for k = 0, ..., r-1,

δ_{k+1}^MR = q_{k+1}^T q_{k+1} Σ_{i=0}^{k} δ_i^2/(q_i^T q_i) + δ_{k+1}^2 = (q_{k+1}^T q_{k+1}/q_k^T q_k) δ_k^MR + δ_{k+1}^2,
y_{k+1}^MR = q_{k+1}^T q_{k+1} Σ_{i=0}^{k} ( δ_i/(q_i^T q_i) ) y_i + δ_{k+1} y_{k+1} = (q_{k+1}^T q_{k+1}/q_k^T q_k) y_k^MR + δ_{k+1} y_{k+1},

as required.

The recursion for x_r^MR, based on x_{r-1}^MR and (q_r, y_r, δ_r), for the case δ_r = 0 is given in Theorem 5.1 and is not reiterated in Corollary 5.1.

Note that the expressions in Theorem 5.1 and Corollary 5.1 for x_k^MR, k = 0, ..., r, are independent of the scaling of (q_k, y_k, δ_k). Hence, the scaling δ_k = 1 may be used in the case H ⪰ 0, c in R(H), as stated in the following corollary.

Corollary 5.2. Assume that H ⪰ 0 and c is in R(H). Let (g_k, x_k, 1), k = 0, ..., r, be given by Proposition 4.1. Then, for each k = 0, ..., r, (q_k, y_k, δ_k) may be replaced by (g_k, x_k, 1) in Theorem 5.1 and Corollary 5.1.

In effect, if H ⪰ 0 and c is in R(H), then x_k^MR, k = 0, ..., r-1, is given as a convex combination of x_i, i = 0, ..., k.
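The recursions of Corollary 5.1 are cheap to attach to the triple generation. The sketch below does so with α_k = 1 and returns x_r^MR when δ_r ≠ 0, and otherwise x_{r-1}^MR, which is a least-squares solution but not the minimum-norm one of Theorem 5.1(c) (that would require the extra correction along y_r). Function names, tolerances and the test problem are our own illustrative choices.

```python
import numpy as np

def minres_via_triples(H, c, tol=1e-12):
    """Minimum-residual iterates via the recursions of Corollary 5.1, built on the
    triples of Proposition 3.1 with alpha_k = 1.  Returns the final iterate x_k^MR."""
    n = c.size
    q, y, d = c.astype(float), np.zeros(n), 1.0
    qm = ym = dm = None
    dMR, yMR = d**2, d * y                       # delta_0^MR, y_0^MR
    xMR = np.zeros(n)                            # x_0^MR = 0
    for k in range(n):
        Hq = H @ q
        beta = (q @ Hq) / (q @ q)
        y_new, d_new = q - beta * y, beta * d
        if qm is not None:
            gamma = (qm @ Hq) / (qm @ qm)
            y_new, d_new = y_new - gamma * ym, d_new + gamma * dm
        y_new = -y_new                           # alpha_k = 1
        q_new = H @ y_new + d_new * c
        ratio = (q_new @ q_new) / (q @ q)        # q_{k+1}^T q_{k+1} / (q_k^T q_k)
        dMR = ratio * dMR + d_new**2             # delta_{k+1}^MR
        yMR = ratio * yMR + d_new * y_new        # y_{k+1}^MR
        if dMR > tol:                            # skip the update when delta_r = 0
            xMR = yMR / dMR                      # x_{k+1}^MR
        qm, ym, dm = q, y, d
        q, y, d = q_new, y_new, d_new
        if np.linalg.norm(q) <= tol:
            break
    return xMR

# Incompatible example: the returned iterate attains the optimal residual norm.
H = np.diag([1.0, 2.0, 0.0]); c = np.array([1.0, 1.0, 1.0])
x = minres_via_triples(H, c)
x_ls = np.linalg.lstsq(H, -c, rcond=None)[0]
print(np.linalg.norm(H @ x + c), np.linalg.norm(H @ x_ls + c))   # both ~1.0
```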

5.1. A minimum-residual algorithm based on the general Krylov method

Next we state an algorithm for the minimum-residual method, based on Algorithm 3.1 and extended with the recursions of Corollary 5.1. We use the same α_k > 0 and tolerances q_tol and δ_tol as for Algorithm 3.1. For a compatible system (1.1), Algorithm 5.2 gives the same solution x_r as Algorithm 3.1 and in addition calculates x_r^MR; both are estimates of a solution of (1.1). Further, if (1.1) is incompatible, then Algorithm 5.2 delivers x_r^MR, an optimal solution to min_{x in R^n} ||Hx + c||_2^2 of minimum Euclidean norm, in addition to the certificate of incompatibility.

Algorithm 5.2 A minimum-residual algorithm based on the general Krylov method
Input arguments: H, c.
Output arguments: x_r^MR, g_r^MR, compatible; (x_r or y_r if compatible = 1 or 0.)
q_tol: tolerance on ||q_k||_2   [our choice: q_tol = ε_M]
δ_tol: tolerance on δ_r   [our choice: δ_tol = ε_M]

k ← 0;  q_0 ← c;  y_0 ← 0;  δ_0 ← 1;
y_0^MR ← δ_0 y_0;  δ_0^MR ← δ_0^2;  x_0^MR ← (1/δ_0^MR) y_0^MR;  g_0^MR ← H x_0^MR + c;
y_1 ← -( q_0 - (q_0^T H q_0/q_0^T q_0) y_0 );  δ_1 ← (q_0^T H q_0/q_0^T q_0) δ_0;  q_1 ← H y_1 + δ_1 c;
α_0 ← nonzero scalar;   [our choice: α_0 = ||c||_2 / ||y_1||_2]
q_1 ← α_0 q_1;  y_1 ← α_0 y_1;  δ_1 ← α_0 δ_1;
if ||q_1||_2 > q_tol or |δ_1| > δ_tol then
    y_1^MR ← (q_1^T q_1/q_0^T q_0) y_0^MR + δ_1 y_1;  δ_1^MR ← (q_1^T q_1/q_0^T q_0) δ_0^MR + δ_1^2;
    x_1^MR ← (1/δ_1^MR) y_1^MR;  g_1^MR ← H x_1^MR + c;
end if
k ← 1;
while ||q_k||_2 > q_tol do
    y_{k+1} ← -( q_k - (q_k^T H q_k/q_k^T q_k) y_k - (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1}) y_{k-1} );
    δ_{k+1} ← (q_k^T H q_k/q_k^T q_k) δ_k + (q_{k-1}^T H q_k/q_{k-1}^T q_{k-1}) δ_{k-1};
    q_{k+1} ← H y_{k+1} + δ_{k+1} c;
    α_k ← nonzero scalar;   [our choice: α_k = ||c||_2 / ||y_{k+1}||_2]
    q_{k+1} ← α_k q_{k+1};  y_{k+1} ← α_k y_{k+1};  δ_{k+1} ← α_k δ_{k+1};
    if ||q_{k+1}||_2 > q_tol or |δ_{k+1}| > δ_tol then
        y_{k+1}^MR ← (q_{k+1}^T q_{k+1}/q_k^T q_k) y_k^MR + δ_{k+1} y_{k+1};
        δ_{k+1}^MR ← (q_{k+1}^T q_{k+1}/q_k^T q_k) δ_k^MR + δ_{k+1}^2;
        x_{k+1}^MR ← (1/δ_{k+1}^MR) y_{k+1}^MR;  g_{k+1}^MR ← H x_{k+1}^MR + c;
    end if
    k ← k + 1;
end while
r ← k;
if |δ_r| > δ_tol then
    x_r ← (1/δ_r) y_r;  compatible ← 1;
else
    x_r^MR ← x_{r-1}^MR - ( (y_r^T x_{r-1}^MR)/(y_r^T y_r) ) y_r;  compatible ← 0;
end if

Revisiting Example 3.1 and applying the extended Algorithm 5.2 to this compatible instance of (1.1), we obtain, in addition to the identical sequences {δ_k}, {q_k} and {y_k} (see Example 3.1), the minimum-residual iterates x_k^MR. [The numerical values of the iterates, and the restated sequence {δ_k}, are omitted here.] For the iterations at which δ_k = 0, it holds that x_k^MR = x_{k-1}^MR, as predicted by Theorem 5.1.

Next we give an example that illustrates Algorithm 5.2, with our choices for α_k > 0, q_tol and δ_tol, on a case where (1.1) is incompatible, i.e., c is not in R(H).

Example 5.1. Let c and the singular diagonal matrix H be chosen so that c is not in R(H). Algorithm 5.2 applied to H and c, with α_k > 0 such that ||y_k||_2 = ||c||_2, k = 1, ..., r, and q_tol = δ_tol = ε_M, yields sequences {q_k}, {y_k}, {δ_k} and {x_k^MR}. [The numerical data of this example is omitted here.] For this example, r = 7. Since |δ_r| < δ_tol, the system is considered incompatible, and x_r^MR is the optimal solution to min_{x in R^n} ||Hx + c||_2^2 of minimum Euclidean norm.

6. Summary and conclusion

We have presented a general Krylov method for solving a system of linear equations Hx + c = 0 with H symmetric. No assumption on H except symmetry is made. In exact arithmetic, we determine whether the system is compatible. If so, a solution is computed. If not, a certificate of incompatibility is computed. The basis for our method is the triples (q_k, y_k, δ_k), k = 0, ..., r, given by Proposition 3.1, which are uniquely defined up to a scaling.

We have obtained the method of conjugate gradients as a special case of our method by giving the scaling for which δ_k = 1, k = 0, ..., r. We have also put the minimum-residual method in our framework and provided explicit formulas for the iterates; again, the analysis is based on the triples. In the case of an incompatible system, our method gives the x_r^MR of minimum Euclidean norm. The original implementation of MINRES [16] did not deliver the optimal solution to min_{x in R^n} ||Hx + c||_2^2 of minimum Euclidean norm; in [2] a MINRES-like algorithm, MINRES-QLP, is developed that does.

One may observe that an alternative to using the minimum-residual iterations would be to consider recursions for y_k^{(1)} and δ_k^{(1)} as given by (3.2), and then calculate x_{r-1}^MR = (1/δ_r^{(1)}) y_r^{(1)}, according to the analysis in the proofs of Theorems 3.1 and 5.1. However, such an approach would not automatically yield the estimates x_k^MR, k = 0, ..., r-2.

One could also note that the method of conjugate gradients may be viewed as trying to solve the minimum-residual problem (5.1) in the situation where only the present triple (q_k, y_k, δ_k) is allowed in the linear combination, i.e., γ_i = 0, i = 0, ..., k-1. The problem is then not necessarily feasible; it is infeasible exactly when δ_k = 0. One could think of methods other than the minimum-residual method which use a linear combination of more than one triple. By Proposition 3.2, it would suffice to use two consecutive triples, since it cannot hold that δ_{k-1} = 0 and δ_k = 0. The method SYMMLQ [16] uses this property, but we have not investigated the precise relationship to an optimization problem of the same form as (5.1).

Finally, we want to stress that this paper is meant to give insight into Krylov methods; we do not claim any particular numerical properties. The general Krylov method inherits the deficiencies of any method based on the Lanczos process, such as loss of orthogonality of the generated vectors; it is beyond the scope of the present paper to make such an analysis, see, e.g., [8, 13, 14, 17].

The theory of this paper is based on determining whether certain quantities are zero or not. In our algorithms, we have made choices of tolerances that are not meant to be universal; to obtain a fully functioning algorithm, the issue of determining whether a quantity is near-zero would need to be considered in more detail. Also, we have based our analysis on the triples, so that termination of Algorithm 5.2 is based on q_k and δ_k; in practice, one should probably also consider g_k^MR. Further, the use of preconditioning is not explored in this paper; for this subject see, e.g., [1, 3, 4].

A. A result on the Krylov vectors

For completeness, we review a result that characterizes the properties of q_k expressed as in (2.1), needed for the analysis.

Lemma A.1. Let r denote the smallest positive integer k for which K_{k+1}(c, H) ∩ K_k(c, H)^⊥ = {0}. For an index k such that 1 ≤ k ≤ r, let q_k in K_{k+1}(c, H) ∩ K_k(c, H)^⊥ be expressed as in (2.1). Then the scalars δ_k^{(j)}, j = 0, ..., k, are uniquely determined up to a nonzero scaling. In addition, if δ_k^{(k)} ≠ 0, it holds that q_k = 0 if and only if K_{k+1}(c, H) ∩ K_k(c, H)^⊥ = {0}, i.e., if and only if k = r.

Proof. Assume that q_k in K_{k+1}(c, H) ∩ K_k(c, H)^⊥ is expressed as in (2.1). If k < r, then c, Hc, H^2 c, ..., H^k c are linearly independent. Hence, δ_k^{(j)}, j = 0, ..., k, are uniquely determined by q_k. Consequently, as q_k is uniquely defined up to a nonzero scaling, so are δ_k^{(j)}, j = 0, ..., k. For k = r, we have q_r = 0, so that

δ_r^{(r)} H^r c = - Σ_{j=0}^{r-1} δ_r^{(j)} H^j c.   (A.1)

By the definition of r, the vectors c, Hc, H^2 c, ..., H^{r-1} c are linearly independent. Hence, (A.1) shows that a fixed δ_r^{(r)} uniquely determines δ_r^{(j)}, j = 0, ..., r-1. Consequently, a scaling of δ_r^{(r)} gives a corresponding scaling of δ_r^{(j)}, j = 0, ..., r-1. Thus, δ_r^{(j)}, j = 0, ..., r, are uniquely determined up to a common scaling.

Finally, assume that δ_k^{(k)} ≠ 0. By definition, K_{k+1}(c, H) ∩ K_k(c, H)^⊥ = {0} implies q_k = 0. To show the converse, assume that q_k = 0. Then

δ_k^{(k)} H^k c = - Σ_{j=0}^{k-1} δ_k^{(j)} H^j c.   (A.2)

Since δ_k^{(k)} ≠ 0, (A.2) implies H^k c lies in span{c, Hc, H^2 c, ..., H^{k-1} c}, i.e., K_{k+1}(c, H) = K_k(c, H), so that K_{k+1}(c, H) ∩ K_k(c, H)^⊥ = {0}, completing the proof.

References

[1] Axelsson, O. On the efficiency of a class of A-stable methods. Nordisk Tidskr. Informationsbehandling (BIT).
[2] Choi, S.-C. T., Paige, C. C., and Saunders, M. A. MINRES-QLP: a Krylov subspace method for indefinite or singular symmetric systems. SIAM J. Sci. Comput. 33, 4 (2011).
[3] Evans, D. J. The use of pre-conditioning in iterative methods for solving linear equations with symmetric positive definite matrices. J. Inst. Math. Appl.
[4] Evans, D. J. The analysis and application of sparse matrix algorithms in the finite element method.
[5] Forsgren, A., Gill, P. E., and Shinnerl, J. R. Stability of symmetric ill-conditioned systems arising in interior methods for constrained optimization. SIAM J. Matrix Anal. Appl.
[6] Golub, G. H., and O'Leary, D. P. Some history of the conjugate gradient and Lanczos algorithms: 1948-1976. SIAM Rev. 31 (1989).
[7] Golub, G. H., and Van Loan, C. F. Matrix Computations, third ed. Johns Hopkins Series in the Mathematical Sciences. Johns Hopkins University Press, Baltimore, MD, 1996.
[8] Hanke, M. Conjugate Gradient Type Methods for Ill-Posed Problems, vol. 327 of Pitman Research Notes in Mathematics Series. Longman Scientific & Technical, Harlow, 1995.
[9] Hestenes, M. R., and Stiefel, E. Methods of conjugate gradients for solving linear systems. J. Research Nat. Bur. Standards 49 (1952), 409-436.
[10] Lanczos, C. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. J. Research Nat. Bur. Standards 45 (1950).
[11] Lanczos, C. Solution of systems of linear equations by minimized iterations. J. Research Nat. Bur. Standards 49 (1952).
[12] Luenberger, D. G. Linear and Nonlinear Programming, second ed. Addison-Wesley, Boston, MA.
[13] Meurant, G., and Strakoš, Z. The Lanczos and conjugate gradient algorithms in finite precision arithmetic. Acta Numer. 15 (2006).
[14] Paige, C. C. Error analysis of the Lanczos algorithm for tridiagonalizing a symmetric matrix. J. Inst. Math. Appl. 18 (1976).
[15] Paige, C. C. Krylov subspace processes, Krylov subspace methods, and iteration polynomials. In Proceedings of the Cornelius Lanczos International Centenary Conference (Raleigh, NC, 1993), SIAM, Philadelphia, PA, 1994.
[16] Paige, C. C., and Saunders, M. A. Solutions of sparse indefinite systems of linear equations. SIAM J. Numer. Anal. 12 (1975).
[17] Reid, J. K. On the method of conjugate gradients for the solution of large sparse systems of linear equations. In Large Sparse Sets of Linear Equations (Proc. Conf., St. Catherine's Coll., Oxford, 1970). Academic Press, London, 1971.
[18] Saad, Y. Iterative Methods for Sparse Linear Systems, second ed. Society for Industrial and Applied Mathematics, Philadelphia, PA, 2003.
[19] Shewchuk, J. R. An introduction to the conjugate gradient method without the agonizing pain. Tech. rep., Carnegie Mellon University, Pittsburgh, PA, 1994.
[20] Stiefel, E. Relaxationsmethoden bester Strategie zur Lösung linearer Gleichungssysteme. Comment. Math. Helv. 29 (1955).


More information

Absolute value equations

Absolute value equations Linear Algebra and its Applications 419 (2006) 359 367 www.elsevier.com/locate/laa Absolute value equations O.L. Mangasarian, R.R. Meyer Computer Sciences Department, University of Wisconsin, 1210 West

More information

Suppose that the approximate solutions of Eq. (1) satisfy the condition (3). Then (1) if η = 0 in the algorithm Trust Region, then lim inf.

Suppose that the approximate solutions of Eq. (1) satisfy the condition (3). Then (1) if η = 0 in the algorithm Trust Region, then lim inf. Maria Cameron 1. Trust Region Methods At every iteration the trust region methods generate a model m k (p), choose a trust region, and solve the constraint optimization problem of finding the minimum of

More information

Chapter 8 Cholesky-based Methods for Sparse Least Squares: The Benefits of Regularization

Chapter 8 Cholesky-based Methods for Sparse Least Squares: The Benefits of Regularization In L. Adams and J. L. Nazareth eds., Linear and Nonlinear Conjugate Gradient-Related Methods, SIAM, Philadelphia, 92 100 1996. Chapter 8 Cholesky-based Methods for Sparse Least Squares: The Benefits of

More information

Finite-choice algorithm optimization in Conjugate Gradients

Finite-choice algorithm optimization in Conjugate Gradients Finite-choice algorithm optimization in Conjugate Gradients Jack Dongarra and Victor Eijkhout January 2003 Abstract We present computational aspects of mathematically equivalent implementations of the

More information

Computational Linear Algebra

Computational Linear Algebra Computational Linear Algebra PD Dr. rer. nat. habil. Ralf-Peter Mundani Computation in Engineering / BGU Scientific Computing in Computer Science / INF Winter Term 2018/19 Part 4: Iterative Methods PD

More information

ITERATIVE REGULARIZATION WITH MINIMUM-RESIDUAL METHODS

ITERATIVE REGULARIZATION WITH MINIMUM-RESIDUAL METHODS BIT Numerical Mathematics 6-3835/3/431-1 $16. 23, Vol. 43, No. 1, pp. 1 18 c Kluwer Academic Publishers ITERATIVE REGULARIZATION WITH MINIMUM-RESIDUAL METHODS T. K. JENSEN and P. C. HANSEN Informatics

More information

Linear algebra issues in Interior Point methods for bound-constrained least-squares problems

Linear algebra issues in Interior Point methods for bound-constrained least-squares problems Linear algebra issues in Interior Point methods for bound-constrained least-squares problems Stefania Bellavia Dipartimento di Energetica S. Stecco Università degli Studi di Firenze Joint work with Jacek

More information

Research Article Some Generalizations and Modifications of Iterative Methods for Solving Large Sparse Symmetric Indefinite Linear Systems

Research Article Some Generalizations and Modifications of Iterative Methods for Solving Large Sparse Symmetric Indefinite Linear Systems Abstract and Applied Analysis Article ID 237808 pages http://dxdoiorg/055/204/237808 Research Article Some Generalizations and Modifications of Iterative Methods for Solving Large Sparse Symmetric Indefinite

More information

On the Computations of Eigenvalues of the Fourth-order Sturm Liouville Problems

On the Computations of Eigenvalues of the Fourth-order Sturm Liouville Problems Int. J. Open Problems Compt. Math., Vol. 4, No. 3, September 2011 ISSN 1998-6262; Copyright c ICSRS Publication, 2011 www.i-csrs.org On the Computations of Eigenvalues of the Fourth-order Sturm Liouville

More information

Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method

Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method Summary of Iterative Methods for Non-symmetric Linear Equations That Are Related to the Conjugate Gradient (CG) Method Leslie Foster 11-5-2012 We will discuss the FOM (full orthogonalization method), CG,

More information

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method

On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method Optimization Methods and Software Vol. 00, No. 00, Month 200x, 1 11 On the Local Quadratic Convergence of the Primal-Dual Augmented Lagrangian Method ROMAN A. POLYAK Department of SEOR and Mathematical

More information

Krylov Subspace Methods that Are Based on the Minimization of the Residual

Krylov Subspace Methods that Are Based on the Minimization of the Residual Chapter 5 Krylov Subspace Methods that Are Based on the Minimization of the Residual Remark 51 Goal he goal of these methods consists in determining x k x 0 +K k r 0,A such that the corresponding Euclidean

More information

Introduction to Iterative Solvers of Linear Systems

Introduction to Iterative Solvers of Linear Systems Introduction to Iterative Solvers of Linear Systems SFB Training Event January 2012 Prof. Dr. Andreas Frommer Typeset by Lukas Krämer, Simon-Wolfgang Mages and Rudolf Rödl 1 Classes of Matrices and their

More information

ITERATIVE METHODS BASED ON KRYLOV SUBSPACES

ITERATIVE METHODS BASED ON KRYLOV SUBSPACES ITERATIVE METHODS BASED ON KRYLOV SUBSPACES LONG CHEN We shall present iterative methods for solving linear algebraic equation Au = b based on Krylov subspaces We derive conjugate gradient (CG) method

More information

Course Notes: Week 1

Course Notes: Week 1 Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues

More information

The structure of iterative methods for symmetric linear discrete ill-posed problems

The structure of iterative methods for symmetric linear discrete ill-posed problems BIT manuscript No. (will be inserted by the editor) The structure of iterative methods for symmetric linear discrete ill-posed problems L. Dyes F. Marcellán L. Reichel Received: date / Accepted: date Abstract

More information

E5295/5B5749 Convex optimization with engineering applications. Lecture 8. Smooth convex unconstrained and equality-constrained minimization

E5295/5B5749 Convex optimization with engineering applications. Lecture 8. Smooth convex unconstrained and equality-constrained minimization E5295/5B5749 Convex optimization with engineering applications Lecture 8 Smooth convex unconstrained and equality-constrained minimization A. Forsgren, KTH 1 Lecture 8 Convex optimization 2006/2007 Unconstrained

More information

Lec10p1, ORF363/COS323

Lec10p1, ORF363/COS323 Lec10 Page 1 Lec10p1, ORF363/COS323 This lecture: Conjugate direction methods Conjugate directions Conjugate Gram-Schmidt The conjugate gradient (CG) algorithm Solving linear systems Leontief input-output

More information

A PREDICTOR-CORRECTOR PATH-FOLLOWING ALGORITHM FOR SYMMETRIC OPTIMIZATION BASED ON DARVAY'S TECHNIQUE

A PREDICTOR-CORRECTOR PATH-FOLLOWING ALGORITHM FOR SYMMETRIC OPTIMIZATION BASED ON DARVAY'S TECHNIQUE Yugoslav Journal of Operations Research 24 (2014) Number 1, 35-51 DOI: 10.2298/YJOR120904016K A PREDICTOR-CORRECTOR PATH-FOLLOWING ALGORITHM FOR SYMMETRIC OPTIMIZATION BASED ON DARVAY'S TECHNIQUE BEHROUZ

More information

1 Computing with constraints

1 Computing with constraints Notes for 2017-04-26 1 Computing with constraints Recall that our basic problem is minimize φ(x) s.t. x Ω where the feasible set Ω is defined by equality and inequality conditions Ω = {x R n : c i (x)

More information

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space

Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) 1.1 The Formal Denition of a Vector Space Linear Algebra (part 1) : Vector Spaces (by Evan Dummit, 2017, v. 1.07) Contents 1 Vector Spaces 1 1.1 The Formal Denition of a Vector Space.................................. 1 1.2 Subspaces...................................................

More information

ETNA Kent State University

ETNA Kent State University Electronic Transactions on Numerical Analysis. Volume 1, pp. 1-11, 8. Copyright 8,. ISSN 168-961. MAJORIZATION BOUNDS FOR RITZ VALUES OF HERMITIAN MATRICES CHRISTOPHER C. PAIGE AND IVO PANAYOTOV Abstract.

More information

A priori bounds on the condition numbers in interior-point methods

A priori bounds on the condition numbers in interior-point methods A priori bounds on the condition numbers in interior-point methods Florian Jarre, Mathematisches Institut, Heinrich-Heine Universität Düsseldorf, Germany. Abstract Interior-point methods are known to be

More information

Krylov Space Methods. Nonstationary sounds good. Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17

Krylov Space Methods. Nonstationary sounds good. Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17 Krylov Space Methods Nonstationary sounds good Radu Trîmbiţaş Babeş-Bolyai University Radu Trîmbiţaş ( Babeş-Bolyai University) Krylov Space Methods 1 / 17 Introduction These methods are used both to solve

More information

Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated.

Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated. Math 504, Homework 5 Computation of eigenvalues and singular values Recall that your solutions to these questions will not be collected or evaluated 1 Find the eigenvalues and the associated eigenspaces

More information

Conjugate gradient method. Descent method. Conjugate search direction. Conjugate Gradient Algorithm (294)

Conjugate gradient method. Descent method. Conjugate search direction. Conjugate Gradient Algorithm (294) Conjugate gradient method Descent method Hestenes, Stiefel 1952 For A N N SPD In exact arithmetic, solves in N steps In real arithmetic No guaranteed stopping Often converges in many fewer than N steps

More information

(v, w) = arccos( < v, w >

(v, w) = arccos( < v, w > MA322 Sathaye Notes on Inner Products Notes on Chapter 6 Inner product. Given a real vector space V, an inner product is defined to be a bilinear map F : V V R such that the following holds: For all v

More information

Lecture 1. 1 Conic programming. MA 796S: Convex Optimization and Interior Point Methods October 8, Consider the conic program. min.

Lecture 1. 1 Conic programming. MA 796S: Convex Optimization and Interior Point Methods October 8, Consider the conic program. min. MA 796S: Convex Optimization and Interior Point Methods October 8, 2007 Lecture 1 Lecturer: Kartik Sivaramakrishnan Scribe: Kartik Sivaramakrishnan 1 Conic programming Consider the conic program min s.t.

More information

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems Robert M. Freund February 2016 c 2016 Massachusetts Institute of Technology. All rights reserved. 1 1 Introduction

More information

The speed of Shor s R-algorithm

The speed of Shor s R-algorithm IMA Journal of Numerical Analysis 2008) 28, 711 720 doi:10.1093/imanum/drn008 Advance Access publication on September 12, 2008 The speed of Shor s R-algorithm J. V. BURKE Department of Mathematics, University

More information

(v, w) = arccos( < v, w >

(v, w) = arccos( < v, w > MA322 F all203 Notes on Inner Products Notes on Chapter 6 Inner product. Given a real vector space V, an inner product is defined to be a bilinear map F : V V R such that the following holds: For all v,

More information

Elementary linear algebra

Elementary linear algebra Chapter 1 Elementary linear algebra 1.1 Vector spaces Vector spaces owe their importance to the fact that so many models arising in the solutions of specific problems turn out to be vector spaces. The

More information

Preconditioned conjugate gradient algorithms with column scaling

Preconditioned conjugate gradient algorithms with column scaling Proceedings of the 47th IEEE Conference on Decision and Control Cancun, Mexico, Dec. 9-11, 28 Preconditioned conjugate gradient algorithms with column scaling R. Pytla Institute of Automatic Control and

More information

The Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment

The Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment he Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment William Glunt 1, homas L. Hayden 2 and Robert Reams 2 1 Department of Mathematics and Computer Science, Austin Peay State

More information

Conjugate Gradient algorithm. Storage: fixed, independent of number of steps.

Conjugate Gradient algorithm. Storage: fixed, independent of number of steps. Conjugate Gradient algorithm Need: A symmetric positive definite; Cost: 1 matrix-vector product per step; Storage: fixed, independent of number of steps. The CG method minimizes the A norm of the error,

More information

OUTLINE 1. Introduction 1.1 Notation 1.2 Special matrices 2. Gaussian Elimination 2.1 Vector and matrix norms 2.2 Finite precision arithmetic 2.3 Fact

OUTLINE 1. Introduction 1.1 Notation 1.2 Special matrices 2. Gaussian Elimination 2.1 Vector and matrix norms 2.2 Finite precision arithmetic 2.3 Fact Computational Linear Algebra Course: (MATH: 6800, CSCI: 6800) Semester: Fall 1998 Instructors: { Joseph E. Flaherty, aherje@cs.rpi.edu { Franklin T. Luk, luk@cs.rpi.edu { Wesley Turner, turnerw@cs.rpi.edu

More information

MS&E 318 (CME 338) Large-Scale Numerical Optimization

MS&E 318 (CME 338) Large-Scale Numerical Optimization Stanford University, Management Science & Engineering (and ICME) MS&E 318 (CME 338) Large-Scale Numerical Optimization 1 Origins Instructor: Michael Saunders Spring 2015 Notes 9: Augmented Lagrangian Methods

More information

On the second smallest prime non-residue

On the second smallest prime non-residue On the second smallest prime non-residue Kevin J. McGown 1 Department of Mathematics, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093 Abstract Let χ be a non-principal Dirichlet

More information

chapter 12 MORE MATRIX ALGEBRA 12.1 Systems of Linear Equations GOALS

chapter 12 MORE MATRIX ALGEBRA 12.1 Systems of Linear Equations GOALS chapter MORE MATRIX ALGEBRA GOALS In Chapter we studied matrix operations and the algebra of sets and logic. We also made note of the strong resemblance of matrix algebra to elementary algebra. The reader

More information

The Fundamental Theorem of Linear Inequalities

The Fundamental Theorem of Linear Inequalities The Fundamental Theorem of Linear Inequalities Lecture 8, Continuous Optimisation Oxford University Computing Laboratory, HT 2006 Notes by Dr Raphael Hauser (hauser@comlab.ox.ac.uk) Constrained Optimisation

More information

Computing tomographic resolution matrices using Arnoldi s iterative inversion algorithm

Computing tomographic resolution matrices using Arnoldi s iterative inversion algorithm Stanford Exploration Project, Report 82, May 11, 2001, pages 1 176 Computing tomographic resolution matrices using Arnoldi s iterative inversion algorithm James G. Berryman 1 ABSTRACT Resolution matrices

More information

Unconstrained optimization

Unconstrained optimization Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout

More information

A NOTE ON Q-ORDER OF CONVERGENCE

A NOTE ON Q-ORDER OF CONVERGENCE BIT 0006-3835/01/4102-0422 $16.00 2001, Vol. 41, No. 2, pp. 422 429 c Swets & Zeitlinger A NOTE ON Q-ORDER OF CONVERGENCE L. O. JAY Department of Mathematics, The University of Iowa, 14 MacLean Hall Iowa

More information

Conjugate Gradient (CG) Method

Conjugate Gradient (CG) Method Conjugate Gradient (CG) Method by K. Ozawa 1 Introduction In the series of this lecture, I will introduce the conjugate gradient method, which solves efficiently large scale sparse linear simultaneous

More information

Part 4: Active-set methods for linearly constrained optimization. Nick Gould (RAL)

Part 4: Active-set methods for linearly constrained optimization. Nick Gould (RAL) Part 4: Active-set methods for linearly constrained optimization Nick Gould RAL fx subject to Ax b Part C course on continuoue optimization LINEARLY CONSTRAINED MINIMIZATION fx subject to Ax { } b where

More information

TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM

TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM TMA 4180 Optimeringsteori KARUSH-KUHN-TUCKER THEOREM H. E. Krogstad, IMF, Spring 2012 Karush-Kuhn-Tucker (KKT) Theorem is the most central theorem in constrained optimization, and since the proof is scattered

More information

Interior-Point Methods for Linear Optimization

Interior-Point Methods for Linear Optimization Interior-Point Methods for Linear Optimization Robert M. Freund and Jorge Vera March, 204 c 204 Robert M. Freund and Jorge Vera. All rights reserved. Linear Optimization with a Logarithmic Barrier Function

More information

RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY

RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY RITZ VALUE BOUNDS THAT EXPLOIT QUASI-SPARSITY ILSE C.F. IPSEN Abstract. Absolute and relative perturbation bounds for Ritz values of complex square matrices are presented. The bounds exploit quasi-sparsity

More information

Chap 3. Linear Algebra

Chap 3. Linear Algebra Chap 3. Linear Algebra Outlines 1. Introduction 2. Basis, Representation, and Orthonormalization 3. Linear Algebraic Equations 4. Similarity Transformation 5. Diagonal Form and Jordan Form 6. Functions

More information

Nonlinear Programming

Nonlinear Programming Nonlinear Programming Kees Roos e-mail: C.Roos@ewi.tudelft.nl URL: http://www.isa.ewi.tudelft.nl/ roos LNMB Course De Uithof, Utrecht February 6 - May 8, A.D. 2006 Optimization Group 1 Outline for week

More information

Weaker assumptions for convergence of extended block Kaczmarz and Jacobi projection algorithms

Weaker assumptions for convergence of extended block Kaczmarz and Jacobi projection algorithms DOI: 10.1515/auom-2017-0004 An. Şt. Univ. Ovidius Constanţa Vol. 25(1),2017, 49 60 Weaker assumptions for convergence of extended block Kaczmarz and Jacobi projection algorithms Doina Carp, Ioana Pomparău,

More information

The Conjugate Gradient Method for Solving Linear Systems of Equations

The Conjugate Gradient Method for Solving Linear Systems of Equations The Conjugate Gradient Method for Solving Linear Systems of Equations Mike Rambo Mentor: Hans de Moor May 2016 Department of Mathematics, Saint Mary s College of California Contents 1 Introduction 2 2

More information

Topics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems

Topics. The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems Topics The CG Algorithm Algorithmic Options CG s Two Main Convergence Theorems What about non-spd systems? Methods requiring small history Methods requiring large history Summary of solvers 1 / 52 Conjugate

More information

Approximating the matrix exponential of an advection-diffusion operator using the incomplete orthogonalization method

Approximating the matrix exponential of an advection-diffusion operator using the incomplete orthogonalization method Approximating the matrix exponential of an advection-diffusion operator using the incomplete orthogonalization method Antti Koskela KTH Royal Institute of Technology, Lindstedtvägen 25, 10044 Stockholm,

More information

FEM and sparse linear system solving

FEM and sparse linear system solving FEM & sparse linear system solving, Lecture 9, Nov 19, 2017 1/36 Lecture 9, Nov 17, 2017: Krylov space methods http://people.inf.ethz.ch/arbenz/fem17 Peter Arbenz Computer Science Department, ETH Zürich

More information