Incomplete oblique projections for solving large inconsistent linear systems

Math. Program., Ser. B (2008) 111
FULL LENGTH PAPER

Incomplete oblique projections for solving large inconsistent linear systems

H. D. Scolnik · N. Echebest · M. T. Guardarucci · M. C. Vacchino

Received: 15 September 2004 / Accepted: 21 September 2005 / Published online: 17 January 2007
© Springer-Verlag 2007

Abstract  The authors introduced in previously published papers acceleration schemes for Projected Aggregation Methods (PAM), aimed at solving consistent linear systems of equalities and inequalities. They used the basic idea of forcing each iterate to belong to the aggregate hyperplane generated in the previous iteration. That scheme has been applied to a variety of projection algorithms for solving systems of linear equalities or inequalities, showing that the acceleration technique can be used successfully for consistent problems. The aim of this paper is to extend the applicability of those schemes to the inconsistent case, employing incomplete projections onto the set of solutions of the augmented system Ax − r = b. These extended algorithms converge to the least squares solution. For that purpose, oblique projections are used and, in particular, variable oblique incomplete projections are introduced. They are defined by means of matrices that penalize the norm of the residuals very strongly in the first iterations, decreasing their influence with the iteration counter in order to fulfill the convergence conditions. The theoretical properties of the new algorithms are analyzed, and numerical experiences are presented comparing their performance with several well-known projection methods.

Dedicated to Clovis Gonzaga on the occasion of his 60th birthday.

H. D. Scolnik (B)
Departamento de Computación, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Pabellón I, Ciudad Universitaria, C1428EGA Buenos Aires, Argentina
scolnik@fibertel.com.ar

H. D. Scolnik
hugo@dc.uba.ar

N. Echebest · M. T. Guardarucci · M. C. Vacchino
Departamento de Matemática, Facultad de Ciencias Exactas, Universidad Nacional de La Plata, La Plata, Argentina
opti@mate.unlp.edu.ar

Keywords  Projected Aggregation Methods · Incomplete projections · Inconsistent systems

Mathematics Subject Classification (2000)  90C25 · 90C30

1 Introduction

Large and sparse systems of linear equations arise in many important applications [7, 12], such as image reconstruction from projections, radiation therapy treatment planning, other image processing problems, computational mechanics, optimization, etc. Problems of image reconstruction from projections, after a suitable discretization [6], can be represented by a system of linear equations

Ax = b,   (1)

where A is an m × n matrix, m is the number of equations, n the size of the (unknown) image vector and b is the vector of readings. In practice, those systems are often inconsistent, and one usually seeks a point x ∈ R^n that minimizes a certain proximity function. A common approach to such problems is to use algorithms, see for instance Censor and Zenios [5], which employ projections onto convex sets in various ways, using either sequential or simultaneous projections onto the hyperplanes represented by the rows of the complete system or onto blocks of a given partition. A widely used algorithm in Computerized Tomography is the algebraic reconstruction technique (ART), whose origin goes back to Kaczmarz [15], although it is known that in order to get convergence it is necessary to use an underrelaxation parameter that must tend to zero. Among the simultaneous projection algorithms for inconsistent problems we can mention Cimmino, SART [5], Landweber [16], CAV [6] and the more general class of the diagonal weighting (DWE) algorithms [4]. In order to prove convergence of this sort of method in the inconsistent case it is necessary to choose the relaxation parameter in an interval that depends on the largest eigenvalue of A^T A [2]. Recently a block-iterative scheme based on Landweber's method, which not only encompasses the above-mentioned algorithms but also uses a less restrictive condition for proving convergence, was published [14].

The Projected Aggregation Methods (PAM), introduced by Householder and Bauer [13], are more efficient than those of the Cimmino or Kaczmarz type [5, 10] when the problems are consistent. Within this context we have developed acceleration schemes [9, 19-21] for this sort of algorithm, based on projecting the search directions onto the aggregated hyperplanes, with excellent results. In this paper, we extend the above-mentioned algorithms in order to compute the weighted least squares solution of inconsistent systems.

We derive two new algorithms that use a scheme of incomplete oblique projections onto the solution set of the augmented system Ax − r = b, for minimizing the proximity function and for converging to a weighted least squares solution of the system (1).

More explicitly, in order to solve a possibly inconsistent system Ax = b, A ∈ R^{m×n}, b ∈ R^m, m > n, we consider the standard problem:

min_{x ∈ R^n} ‖b − Ax‖²_{D_m},   (2)

where ‖·‖_{D_m} denotes the norm induced by the positive definite diagonal matrix D_m ∈ R^{m×m}; its solutions coincide with those of the problem A^T D_m A x = A^T D_m b. We define two convex sets in the (n + m)-dimensional space R^{n+m}, denoting by [u; v] the vertical concatenation of u ∈ R^n with v ∈ R^m,

P = {p : p = [x; r] ∈ R^{n+m}, x ∈ R^n, r ∈ R^m, Ax − r = b},

and

Q = {q : q = [x; 0] ∈ R^{n+m}, x ∈ R^n, 0 ∈ R^m},

adopting the distance d(p, q) = ‖p − q‖_D for all p ∈ P, q ∈ Q. Here D is a diagonal matrix of order n + m whose first n diagonal elements are 1's and whose last m elements coincide with those of D_m. By means of a direct application of the Karush–Kuhn–Tucker (KKT) conditions [5] to the problem

min { ‖p − q‖²_D : p ∈ P, q ∈ Q },   (3)

we obtain that its solutions p* ∈ P, p* = [x*; r*], and q* = [y*; 0] ∈ Q satisfy x* = y*, A^T D_m r* = 0, and ‖p* − q*‖²_D = ‖r*‖²_{D_m}. Therefore this problem is equivalent to problem (2). This observation led us to develop a new method for solving (2), applying an alternate projections scheme between the sets P and Q, similar to the one of Csiszár and Tusnády [8], but replacing the computation of the exact projections onto P by suitable incomplete or approximate projections, according to the following basic scheme.

Algorithm 1 (Basic Algorithm)
Iterative step: Given p^k = [x^k; r^k] ∈ P, q^k = [x^k; 0] ∈ Q, find p_a^{k+1} = [x^{k+1}; r^{k+1}] ∈ P as

p_a^{k+1} ≈ arg min{ ‖p − q^k‖²_D : p ∈ P },

then define p^{k+1} = p_a^{k+1}, and q^{k+1} ∈ Q by means of

q^{k+1} = [x^{k+1}; 0] = arg min{ ‖p^{k+1} − q‖²_D : q ∈ Q }.

In order to compute the incomplete projections onto P we apply our ACIMM algorithm as presented in [19], later renamed ACCIM in [21] and the subsequent papers. This iterative algorithm uses simultaneous projections onto the hyperplanes of the constraints of Ax − r = b; it is very efficient for solving consistent problems and convenient for computing approximate oblique projections with some required properties.
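To make the basic alternating scheme concrete, the following is a minimal numpy sketch of Algorithm 1 with exact projections onto P, obtained from the KKT system of the projection subproblem; this is the idealized version that the paper later replaces by incomplete ACCIM projections. All names are illustrative and D_m is passed as a vector of diagonal weights; this is not the authors' implementation.

```python
import numpy as np

def exact_alternating_ls(A, b, Dm, x0, iters=100):
    """Basic alternating scheme (Algorithm 1) with *exact* oblique projections.
    Projecting q^k = [x^k; 0] onto P = {[x; r] : A x - r = b} in the D-norm
    amounts to solving (I_n + A^T D_m A) x = x^k + A^T D_m b; projecting back
    onto Q simply drops the residual block.  The iterates converge to the
    weighted least squares solution of A x = b in the D_m-norm."""
    n = A.shape[1]
    x = x0.astype(float).copy()
    M = np.eye(n) + A.T @ (Dm[:, None] * A)      # I_n + A^T D_m A
    rhs_fixed = A.T @ (Dm * b)                   # A^T D_m b
    for _ in range(iters):
        x = np.linalg.solve(M, x + rhs_fixed)    # x-block of the projection onto P
        # r = A @ x - b completes the point p^{k+1} = [x; r] in P
    return x

# tiny illustration on an inconsistent 3x2 system
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
b = np.array([1.0, 1.0, 0.0])
x = exact_alternating_ls(A, b, Dm=np.ones(3), x0=np.zeros(2))
# x approaches the least squares solution, i.e. the solution of A^T A x = A^T b
```

Each exact projection requires solving a linear system with I_n + A^T D_m A, which is precisely the cost the incomplete projections of the new algorithms avoid.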

In the following sections we will present two algorithms based on the same basic scheme and will prove their convergence. One keeps the initial matrix D and its induced norm constant, while the other is a dynamic version in which the matrix of weights D_m changes with the iteration counter. The entries of that matrix are large at the beginning and decrease progressively until reaching convergence. The matrices defining the oblique projections arise from the sparsity pattern of the problem. For the second algorithm they also serve the purpose of penalizing the residuals in each iteration. At the beginning a strong penalization leads to a substantial decrease of the norm of some components of the residual, and later on smaller values allow convergence to a desired solution. The new algorithms can be easily parallelized and implemented for dealing with blocks of constraints.

The algorithms are compared with ART [5] and CAV [6], using a problem of image reconstruction from projections. The implementations were done within SNARK93 [1], a software package for testing and evaluating algorithms for image reconstruction problems, using the most efficient values for the relaxation parameters as reported in [6]. We also present other numerical experiences comparing the performance of the new algorithms with those of Landweber [16], Double ART [2] and the Kaczmarz Extended algorithm [18].

The paper is organized as follows: in Sect. 2 we briefly review some notation and results necessary for describing the new algorithms. In Sect. 3 the new oblique projection IOP and IVOP algorithms are presented, together with some related results needed for defining them. The convergence theory is in Appendix B. In Sect. 4 some numerical experiences are described, together with some preliminary conclusions. In Appendix A we give some properties of the ACCIM algorithm that were not published before, needed for understanding its application to the new algorithms and for proving their convergence.

2 Projection algorithms

Let us assume we have a system Ax = b, A ∈ R^{m×n}, m ≥ n, b ∈ R^m. If it is compatible, we will denote by x* any solution of it. From here on, ‖x‖ denotes the Euclidean norm of x ∈ R^n, and ‖x‖_D the norm induced by a positive definite matrix D. We will also assume that each row of A has Euclidean norm equal to 1. We use the notation e_i for the ith column of I_n, where the symbol I_n denotes the identity matrix in R^{n×n}. We will use the superscript T for the transpose of a matrix. Given M ∈ R^{n×r}, we denote by m_i the ith column of M^T, and by R(M) the subspace spanned by the columns of M; P_M and P_M^D are the orthogonal and the oblique projectors onto R(M), whose formulas, under the assumption that rank(M) = r, are P_M = M(M^T M)^{-1} M^T and P_M^D = M(M^T D M)^{-1} M^T D, respectively. We will use the notation R(M)^⊥_D for the D-orthogonal complement of R(M), and P^D_{M⊥} for the corresponding projector. In particular, if M = [v] ∈ R^{n×1}, we will write P^D_{v⊥}. We adopt the convention that if v = 0 ∈ R^n, then P^D_{v⊥} = I_n. We denote a diagonal matrix of order m by D = diag(d), where d = (d_1, ..., d_m).
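As a small illustration of this notation, the snippet below builds the oblique projector P_M^D = M(M^T D M)^{-1} M^T D and the rank-one complementary projector P^D_{v⊥} used later; it is a generic numpy sketch (function names are ours, not from the paper).

```python
import numpy as np

def oblique_projector(M, D):
    """P_M^D = M (M^T D M)^{-1} M^T D: oblique projector onto R(M) along the
    D-orthogonal complement, assuming rank(M) equals its number of columns."""
    MtDM = M.T @ D @ M
    return M @ np.linalg.solve(MtDM, M.T @ D)

def proj_perp(v, D, y):
    """P^D_{v-perp} y = y - v (v^T D y) / (v^T D v); by convention the identity
    is used when v = 0 (first ACCIM iteration)."""
    vDv = v @ D @ v
    return y.copy() if vDv == 0 else y - v * ((v @ D @ y) / vDv)

# quick check: P_M^D is idempotent and leaves R(M) fixed
D = np.diag([1.0, 2.0, 3.0])
M = np.array([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
P = oblique_projector(M, D)
assert np.allclose(P @ P, P) and np.allclose(P @ M, M)
```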

For each constraint of Ax = b, we denote L_i = {x ∈ R^n : a_i^T x = b_i}, r_i(x) = a_i^T x − b_i, and the oblique projection of x onto L_i by

P_i^D(x) = x − (r_i(x) / (a_i^T D^{-1} a_i)) D^{-1} a_i.   (4)

An oblique version of ACCIM [21], a parallel simultaneous projection algorithm for solving consistent problems, using weights w_i ≥ 0, is described by:

Algorithm 2 (ACCIM)
Initialization: Given x^0, set v = 0 and k ← 0.
Iterative step: Given x^k and the direction v, compute d^k = Σ_{i=1}^m w_i (P_i^D(x^k) − x^k), ˆd^k = P^D_{v⊥} d^k, and x^{k+1} = x^k + λ_k ˆd^k, with λ_k given by (5); set v = ˆd^k and k ← k + 1.

λ_k = ( Σ_{i=1}^m w_i ‖P_i^D(x^k) − x^k‖²_D ) / ‖ˆd^k‖²_D.   (5)

Remark 1  The convergence properties of this algorithm were published in [19] and other related results in [21]. In particular, the value of (5) corresponds to the solution of the problem

min_λ ‖x^k + λ ˆd^k − x*‖²_D.   (6)

The necessary condition of optimality for this problem is also sufficient because of its convexity, and therefore in every iteration the following property holds:

(x^k + λ_k ˆd^k − x*)^T D ˆd^k = 0.   (7)

Furthermore, it follows immediately from the definition of λ_k that the product λ_k ˆd^k is independent of the length of d^k. Therefore, this justifies requiring only the condition w_i ≥ 0 for the w_i, without Σ_{i=1}^m w_i = 1. In Appendix A we will present some properties of ACCIM required for analyzing the new algorithms. For the moment, we just point out that the sequence {x^k} generated by ACCIM, from the initial point x^0, converges to the solution x* of Ax = b satisfying

x* = arg min { ‖x − x^0‖²_D : x ∈ R^n, Ax = b }.   (8)
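The following is a minimal dense numpy sketch of Algorithm 2 under the stated assumptions (consistent system, diagonal D, weights w_i ≥ 0). It is an illustrative reconstruction from (4)-(5), not the authors' code; it uses the identity ‖P_i^D(x) − x‖²_D = r_i(x)² / (a_i^T D^{-1} a_i), which is what the step length (5) evaluates.

```python
import numpy as np

def accim(A, b, Ddiag, w, x0, iters=500, tol=1e-12):
    """ACCIM (Algorithm 2) for a consistent system A x = b with the oblique
    norm induced by D = diag(Ddiag) on R^n and weights w_i >= 0.  Dense sketch."""
    x = x0.astype(float).copy()
    v = np.zeros_like(x)                              # previous dhat (v = 0 initially)
    Dinv = 1.0 / Ddiag
    denom = np.einsum('ij,j,ij->i', A, Dinv, A)       # a_i^T D^{-1} a_i
    for _ in range(iters):
        r = A @ x - b                                 # r_i(x) = a_i^T x - b_i
        if np.linalg.norm(r) <= tol:
            break
        d = -Dinv * (A.T @ (w * r / denom))           # sum_i w_i (P_i^D(x) - x)
        vDv = np.sum(v * Ddiag * v)
        dhat = d if vDv == 0 else d - v * np.sum(v * Ddiag * d) / vDv   # P^D_{v-perp} d
        lam = np.sum(w * r**2 / denom) / np.sum(dhat * Ddiag * dhat)    # step (5)
        x, v = x + lam * dhat, dhat
    return x

# consistent toy system with solution x = (1, 2)
A = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
b = np.array([1.0, 2.0, 1.5])
x = accim(A, b, Ddiag=np.ones(2), w=np.ones(3), x0=np.zeros(2))
```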

3 Inconsistent case

As said in Sect. 1, for possibly inconsistent systems Ax = b, A ∈ R^{m×n}, b ∈ R^m, m > n, we will consider the standard problem (2), whose solution coincides with that of the problem:

A^T D_m A x = A^T D_m b.   (9)

When rank(A) = n and the diagonal matrix of weights D_m is positive definite, we could extend the ACCIM algorithm directly to the inconsistent case if, in the computation of λ_k, the D-norm were replaced by the A^T D_m A-norm, considering

λ_k = arg min_{λ ≥ 0} ‖x^k + λ ˆd^k − x*‖²_{A^T D_m A},   (10)

defining d^k as in ACCIM, and ˆd^k = P_{v⊥}(d^k) with v = A^T D_m A ˆd^{k−1}, an approach that implies a high computational cost. In this paper we present algorithms that are particular cases of Algorithm 1.

3.1 Incomplete oblique projections algorithm

We consider a diagonal weighting matrix D ∈ R^{(n+m)×(n+m)},

D = ( I_n  0
      0   D_m ),   (11)

and the sets P and Q formerly introduced. Given p^k ∈ P and its projection q^k onto Q, we denote by p^D_min(q^k) the projection of q^k onto P, which is the solution of the problem

min{ ‖p − q^k‖_D : p ∈ P }.   (12)

In particular, if p* = [x*; r*] ∈ P, with x* a solution of (9), and q* = [x*; 0] is its projection onto Q, we get p* = p^D_min(q*), and the squared minimum distance between the sets P and Q is ‖r*‖²_{D_m}. In the new algorithms, given q^k ∈ Q, instead of defining p^{k+1} = p^D_min(q^k), we define p^{k+1} = p_a^{k+1}, where p_a^{k+1} ∈ P is a point obtained by means of the incomplete resolution of problem (12).

Remark 2  The results in [3] allow us to prove that the sequence given by Algorithm 1 is convergent when p^{k+1} = p^D_min(q^k).

It is interesting to note that, taking into account the KKT conditions, the exact oblique projection step for (12), i.e.

minimize (1/2) ‖[x; r] − [x^k; 0]‖²_D  subject to  Ax − r = b,

can be written, in terms of x, as

x^D_min = x^k + A^T D_m (b − A x^D_min),  if  p^D_min(q^k) = [x^D_min; r^D_min].   (13)

Thus, for non-singular matrices A^T D_m A, the basic scheme with exact projections as in Csiszár and Tusnády [8] generates a sequence that, in relation to the solution x* of (9), satisfies:

x^D_min − x* = (I_n + A^T D_m A)^{-1} (x^k − x*).   (14)

From (14) it follows that the exact scheme always converges, but its convergence rate depends on the choice of the matrix D_m. For instance, if we could choose a matrix D_m leading to an upper bound much less than one for the norm ‖(I_n + A^T D_m A)^{-1}‖ = 1/(1 + λ_min(A^T D_m A)), where λ_min stands for the least eigenvalue of a symmetric positive definite matrix, then we would get very fast convergence. The idea to study is to select D_m in such a way that λ_min(A^T D_m A) > λ_min(A^T A). Since A^T D_m A is positive definite, and u^T A^T D_m A u = (Au)^T D_m (Au) for all u ∈ R^n with ‖u‖ = 1, then multiplying and dividing by ‖Au‖² we get u^T A^T D_m A u ≥ λ_min(D_m) λ_min(A^T A), and therefore λ_min(A^T D_m A) ≥ λ_min(D_m) λ_min(A^T A). This means that when λ_min(A^T A) ≥ 1 it would be enough to define a diagonal matrix D_m with entries greater than one, while if λ_min(A^T A) << 1 we need λ_min(D_m) to be of the order of 1/λ_min(A^T A). Then, it would be very helpful to be able to obtain estimates of the least eigenvalue of symmetric positive definite matrices for getting higher rates of convergence of this sort of iterative method.

On the other hand, Eq. (13) is algebraically similar to the one that the iterates of the class of Landweber's methods satisfy [2]:

x^{k+1} = x^k + γ A^T D_m (b − A x^k),   (15)

and, in regard to x*,

x^{k+1} − x* = (I_n − γ A^T D_m A)(x^k − x*).   (16)

In order to prove convergence [2, 14] it is necessary that γ ∈ (0, 2/L], L being the largest eigenvalue of A^T D_m A. This condition requires a good estimate of L. In [2] there are interesting results in regard to the estimation of L using the maximum number of non-zero elements in the columns of A.

As mentioned before, the algorithm based on exact projections is always convergent but its computational cost is high. Therefore we will present the theory of inexact projections, aiming at similar convergence properties with a much lower computational cost. In order to define the inexact projection p_a^{k+1} ≈ p^D_min(q^k), we consider the following:

Definition 1  Given an approximation p̂ = [z; µ], z ∈ R^n, µ ∈ R^m, of p^D_min(q^k), we denote by P(p̂) = [z; µ + s] the solution of the system Ax − r = b that satisfies µ + s = Az − b.

Aiming at obtaining properties of the sequence {p^k} generated by the new algorithm that guarantee convergence to the solution of (3), we establish an acceptance condition that an approximation p̂ = [z; µ] of p^D_min(q^k) must satisfy.

Definition 2 (Acceptance condition)  An approximation p̂ = [z; µ] of p^D_min(q^k) is acceptable if it satisfies

‖p̂ − q^k‖²_D ≤ ‖p^D_min(q^k) − q^k‖²_D  and  ‖p̂ − P(p̂)‖²_D ≤ γ ‖p̂ − p^k‖²_D,  with 0 < γ < 1.   (17)

Remark 3  In particular, p̂ = p^D_min(q^k) satisfies (17).

In order to see that it is feasible to find, by an iterative algorithm, an approximation p̂ of p^D_min(q^k) satisfying (17), it is necessary to consider its convergence properties. We will use the ACCIM algorithm and, taking into account its properties, we will prove that it is possible to find a j > 0 such that [z^j; µ^j] satisfies (17).

Lemma 1  Given p^k = [x^k; r^k], q^k = [x^k; 0], if the sequence {[z^j; µ^j]} is generated by the ACCIM algorithm with [z^0; µ^0] = q^k, then
(i) if p^D_min(q^k) ≠ p^k, an index ĵ exists such that [z^ĵ; µ^ĵ] satisfies (17);
(ii) if p^D_min(q^k) = p^k, then the unique solution of (17) is p^k.

Proof  Assume first that ‖p^k − p^D_min(q^k)‖_D ≠ 0. By property (iii) of Lemma 2 in Appendix A, every approximation p̂ satisfies the Fejér monotonicity condition ‖p̂ − q^k‖²_D ≤ ‖p^D_min(q^k) − q^k‖²_D. In order to see that it also fulfills the remaining condition in (17), assume by contradiction that the approximations p̂ = [z^j; µ^j] satisfy

‖p̂ − P(p̂)‖²_D > γ ‖p̂ − p^k‖²_D  for all j > 0.   (18)

Due to the properties of ACCIM, p̂ converges to p^D_min(q^k). If we denote s^j = Az^j − µ^j − b, we know that ‖s^j‖_{D_m} goes to 0. From the definition of P(p̂) we get ‖p̂ − P(p̂)‖²_D = ‖s^j‖²_{D_m}, and since p^D_min(q^k) is the solution of (12) we know that p^D_min(q^k) − q^k is D-orthogonal to p − p^D_min(q^k) for all p ∈ P. Furthermore, due to Lemma 2 of Appendix A, p̂ − q^k is D-orthogonal to p − p̂ for all p ∈ P; then p̂ satisfies ‖p̂ − p^k‖²_D = ‖p̂ − p^D_min(q^k)‖²_D + ‖p^D_min(q^k) − p^k‖²_D. Therefore, assuming that inequality (18) holds leads to ‖s^j‖²_{D_m} > γ ‖p^D_min(q^k) − p^k‖²_D for all j > 0. Since ‖s^j‖_{D_m} tends to zero, we get ‖p^D_min(q^k) − p^k‖²_D = 0, contradicting the hypothesis ‖p^D_min(q^k) − p^k‖²_D ≠ 0. Hence an index ĵ must exist for which p̂ = [z^ĵ; µ^ĵ] satisfies condition (17).

When p^D_min(q^k) = p^k, as a consequence of (iii) of Lemma 2 in Appendix A we get ‖p̂ − p^k‖²_D ≤ ‖p̂ − P(p̂)‖²_D for all p̂ = [z^j; µ^j]. Therefore the unique vector satisfying (17) is p̂ = p^D_min(q^k) = p^k.
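For reference, the acceptance test (17) is cheap to evaluate once the residual of the inner iterate is available; the sketch below (our naming, with D_m passed as a diagonal vector) handles P(p̂) implicitly through s = Az − µ − b. The first inequality of (17) is guaranteed by the Fejér monotonicity of ACCIM, so only the second one needs to be checked in practice.

```python
import numpy as np

def acceptable(A, b, Dm, z, mu, xk, rk, gamma):
    """Acceptance condition (17) for an approximation phat = [z; mu] of the
    oblique projection of q^k = [x^k; 0] onto P.  Since P(phat) = [z; mu + s]
    with s = A z - mu - b, we have ||phat - P(phat)||_D^2 = ||s||_{Dm}^2, and
    ||phat - p^k||_D^2 = ||z - x^k||^2 + ||mu - r^k||_{Dm}^2."""
    s = A @ z - mu - b
    lhs = np.sum(Dm * s * s)
    rhs = gamma * (np.sum((z - xk)**2) + np.sum(Dm * (mu - rk)**2))
    return lhs <= rhs
```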

In order to introduce the alternate incomplete projections algorithm, we define the approximation p_a^{k+1} of p^D_min(q^k) by

p_a^{k+1} = P(p̂),  if p̂ = [z^j; µ^j] satisfies (17).   (19)

Then, once p^{k+1} = [x^{k+1}; r^{k+1}] = P(p̂) is computed, the second step of the new projection algorithm defines q^{k+1} = [x^{k+1}; 0] ∈ Q. We describe in the following a practical algorithm that uses the ACCIM version given in Appendix A.

Algorithm 3 Incomplete oblique projections (IOP)
Initialization: Given 0 < γ < 1, a positive definite diagonal matrix D_m of order m and p^0 = [x^0; r^0], set r^0 = Ax^0 − b, q^0 = [x^0; 0] ∈ Q and k ← 0.
Iterative step: Given p^k = [x^k; r^k], set q^k = [x^k; 0].
Calculate p̂, an approximation of p^D_min(q^k) satisfying (17), applying ACCIM as follows: define y^0 = [z^0; µ^0] = q^k as the initial point. For solving Ax − r = b, iterate until finding y^j = [z^j; µ^j] such that s^j = Az^j − µ^j − b satisfies (17), that is

‖s^j‖²_{D_m} ≤ γ (‖r^k‖²_{D_m} − S_j),  with S_j = ‖y^j − y^0‖²_D.

Define p^{k+1} = [x^{k+1}; r^{k+1}], with x^{k+1} = z^j and r^{k+1} = µ^j + s^j.
Define q^{k+1} = [x^{k+1}; 0] ∈ Q. k ← k + 1.

Remark 4  The inequality used in the algorithm to accept [z^j; µ^j],

‖s^j‖²_{D_m} ≤ γ (‖r^k‖²_{D_m} − S_j),  with S_j = ‖y^j − y^0‖²_D,   (20)

arises from replacing in the second inequality of (17) ‖p̂ − P(p̂)‖²_D = ‖s^j‖²_{D_m} and ‖p^k − p̂‖²_D = ‖(p^k − q^k) − (p̂ − q^k)‖²_D. By (iii) of Lemma 2 in Appendix A we know that p^k − p̂ and p̂ − q^k are D-orthogonal; using the fact that ‖p^k − q^k‖²_D = ‖[0; r^k]‖²_D = ‖r^k‖²_{D_m}, we get ‖p^k − p̂‖²_D = ‖r^k‖²_{D_m} − ‖p̂ − q^k‖²_D. Moreover, using again the results of Lemma 2 in Appendix A, ‖p̂ − q^k‖²_D = ‖y^j − y^0‖²_D. The convergence results are given in Lemmas 3 and 4 of Appendix B.
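A minimal self-contained sketch of Algorithm 3 is given below, assuming D_m = I_m so that all norms are Euclidean. The inner loop is a plain ACCIM pass on the consistent augmented system A z − µ = b (formed explicitly here for clarity; Appendix A.2 shows how to avoid building [A, −I_m]), stopped by the acceptance test (20). Uniform inner weights are used, which is admissible since only w_i ≥ 0 is required; this is an illustration, not the authors' implementation.

```python
import numpy as np

def iop(A, b, x0, gamma=0.5, outer=30, inner_max=200):
    """Incomplete oblique projections (Algorithm 3) with D_m = I_m."""
    m, n = A.shape
    Abar = np.hstack([A, -np.eye(m)])            # rows of the augmented system A z - mu = b
    denom = np.sum(Abar**2, axis=1)              # abar_i^T abar_i
    x = x0.astype(float).copy()
    r = A @ x - b
    for _ in range(outer):
        y0 = np.concatenate([x, np.zeros(m)])    # q^k = [x^k; 0]
        y, v = y0.copy(), np.zeros(n + m)
        rk2 = r @ r                              # ||r^k||^2
        for _ in range(inner_max):
            s = Abar @ y - b                     # s^j = A z^j - mu^j - b
            Sj = np.sum((y - y0)**2)             # S_j = ||y^j - y^0||^2
            if s @ s <= gamma * (rk2 - Sj):      # acceptance test (20)
                break
            d = -Abar.T @ (s / denom)            # simultaneous projection direction
            vv = v @ v
            dhat = d if vv == 0 else d - v * (v @ d) / vv
            lam = np.sum(s**2 / denom) / (dhat @ dhat)
            y, v = y + lam * dhat, dhat
        z, mu = y[:n], y[n:]
        s = A @ z - mu - b
        x, r = z, mu + s                         # p^{k+1} = P(phat)
    return x, r
```

Each inner pass touches every equation once, so one inner iteration is comparable in cost to one sweep of a simultaneous method such as Landweber; the outer loop only bookkeeps the alternating projections.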

3.2 Incomplete oblique projections using penalization parameters

The equality (14) suggests that if the matrix D_m penalizes each violated constraint, the solution of problem (2) will be obtained more rapidly. In particular, if a diagonal matrix D_m = β_0 C_m, β_0 > 0, is considered, with C_m = diag(c), c_i > 0, i = 1, ..., m, then applying formula (13) from an initial point y^0, the exact solution of problem (12) has the expression

x^D_min = (I_n + β_0 A^T C_m A)^{-1} (y^0 + β_0 A^T C_m b),   (21)

which coincides with

x^D_min = ((1/β_0) I_n + A^T C_m A)^{-1} ((1/β_0) y^0 + A^T C_m b).   (22)

When β_0 > 0 is large enough, we see that such a solution is an approximation to the solution of the problem A^T C_m A x = A^T C_m b. In particular, if D_m = I_m + β_0 C_m we get

x^D_min = ((1/β_0) I_n + A^T ((1/β_0) I_m + C_m) A)^{-1} ((1/β_0) y^0 + A^T ((1/β_0) I_m + C_m) b).   (23)

Likewise, as in (22), if β_0 is large enough such a solution is an approximation to the solution of A^T C_m A x = A^T C_m b. In the special case C_m = diag(e_i), if we penalize with the constant β_0 we would get an almost zero residual for the ith equation, because the solution of

((1/β_0) I_n + A^T ((1/β_0) I_m + C_m) A) x^D_min = (1/β_0) y^0 + A^T ((1/β_0) I_m + C_m) b   (24)

almost satisfies a_i a_i^T x^D_min = a_i b_i, that is, a_i (a_i^T x^D_min − b_i) ≈ 0.

With the purpose of getting a substantial reduction of the components of the residual corresponding to the solution of problem (12), we replace the matrix D_m by a sequence of penalization matrices D^k_m. Intuitively speaking, we propose to use large elements in the matrix D^0_m for penalizing some or all of the constraints, according to given criteria or previous knowledge of the problem. Later on, the penalization should become progressively weaker in order to get convergence to a desirable solution. In Algorithm 3 we replace the D matrix defined in (11) by

D^k = ( I_n  0
        0   D^k_m )   (25)

in each iteration, where D^k_m = I_m + diag(α^k_1, ..., α^k_m). For each i = 1, ..., m, α^k_i > 0 is a penalty parameter corresponding to the ith equation. We consider a decreasing sequence {α^k_i}_k such that lim_{k→∞} α^k_i = 0 for i = 1, ..., m. We define, for all p ∈ P and q ∈ Q, d(p, q) = ‖p − q‖_{D^k}.

The new algorithm, using variable oblique projections induced by D^k, is the following:

Algorithm 4 Incomplete variable oblique projections (IVOP)
Initialization: Given 0 < γ < 1, a positive definite diagonal matrix D^0_m of order m and p^0 = [x^0; r^0], set r^0 = Ax^0 − b, q^0 = [x^0; 0] ∈ Q and k ← 0.
Iterative step: Given p^k = [x^k; r^k] and D^k_m, set q^k = [x^k; 0].
Calculate p̂, an approximation of p^{D^k}_min(q^k) satisfying (17), applying the version of ACCIM with oblique projections induced by the D^k matrix as follows: define y^0 = [z^0; µ^0] = q^k as the initial point. For solving Az − µ = b, iterate until finding y^j = [z^j; µ^j] such that s^j = Az^j − µ^j − b satisfies (17), that is

‖s^j‖²_{D^k_m} ≤ γ (‖r^k‖²_{D^k_m} − S_j),  with S_j = ‖y^j − y^0‖²_{D^k}.

Define p^{k+1} = [x^{k+1}; r^{k+1}], with x^{k+1} = z^j and r^{k+1} = µ^j + s^j.
Define q^{k+1} = [x^{k+1}; 0] ∈ Q. Update D^{k+1}_m; k ← k + 1.

It is interesting to point out the difference between the solution of problem (12) using the norm induced by D^k and the one obtained with the Euclidean norm in an algorithmic scheme such as the one proposed here. We denote by ‖·‖_{α^k} the vector norm induced by the matrix diag(α^k) defined in (25).

Remark 5  Given q^k = [x^k; 0] ∈ Q, let [x_min; r_min] be the solution of problem (12) with D_m = I_m, and [x^{D^k}_min; r^{D^k}_min] the solution of the analogous problem with D_m = D^k_m. By definition of [x_min; r_min], every [x; r] ∈ P satisfies ‖x_min − x^k‖² + ‖r_min‖² ≤ ‖x − x^k‖² + ‖r‖²; in particular,

‖x_min − x^k‖² + ‖r_min‖² ≤ ‖x^{D^k}_min − x^k‖² + ‖r^{D^k}_min‖².

On the other hand, [x^{D^k}_min; r^{D^k}_min] satisfies

‖x^{D^k}_min − x^k‖² + ‖r^{D^k}_min‖²_{D^k_m} ≤ ‖x_min − x^k‖² + ‖r_min‖²_{D^k_m}.

Considering that ‖r^{D^k}_min‖²_{D^k_m} = ‖r^{D^k}_min‖² + ‖r^{D^k}_min‖²_{α^k}, and replacing this in the former expression, we get

‖x^{D^k}_min − x^k‖² + ‖r^{D^k}_min‖² + ‖r^{D^k}_min‖²_{α^k} ≤ ‖x_min − x^k‖² + ‖r_min‖² + ‖r_min‖²_{α^k}.

Hence ‖r^{D^k}_min‖²_{α^k} < ‖r_min‖²_{α^k}. Then we observe that Σ_i α^k_i ((r^{D^k}_min)_i² − (r_min)_i²) < 0, and this shows that a subset of components of the residual r^{D^k}_min exists whose magnitudes are significantly less than those of r_min, and such that its weighted sum is greater than that of the remaining components. The results on the convergence of the IVOP algorithm are given in Lemmas 3 and 4 of Appendix B.
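The only structural change with respect to Algorithm 3 is that the inner ACCIM pass and the acceptance test are carried out in the D^k-norm, which enters through the row denominators ā_i^T (D^k)^{-1} ā_i and the weighted inner products. A sketch of one D^k-weighted inner step on the augmented system is given below (our naming; uniform weights w_i, which is admissible since only w_i ≥ 0 is required, whereas Appendix A.2 uses w_i = 1 + (D^k_m)_i).

```python
import numpy as np

def accim_step_Dk(Abar, b, Dk, y, v):
    """One inner ACCIM step on the augmented system Abar y = b, Abar = [A, -I_m],
    under the variable norm induced by D^k = diag(Dk), Dk = [1,...,1, D_m^k]."""
    s = Abar @ y - b                                          # row residuals s^j
    denom = np.einsum('ij,j,ij->i', Abar, 1.0 / Dk, Abar)     # abar_i^T (D^k)^{-1} abar_i
    d = -(1.0 / Dk) * (Abar.T @ (s / denom))                  # sum of oblique projection moves
    vDv = np.sum(v * Dk * v)
    dhat = d if vDv == 0 else d - v * np.sum(v * Dk * d) / vDv
    lam = np.sum(s**2 / denom) / np.sum(dhat * Dk * dhat)
    return y + lam * dhat, dhat

# In IVOP the outer loop rebuilds Dk = concatenate(ones(n), 1 + alpha_k) before each
# incomplete projection and then shrinks alpha_k (for example alpha_{k+1} = alpha_k / 2,
# one simple choice ensuring alpha_k -> 0); the acceptance test (20) is evaluated
# with the same D_m^k weights.
```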

4 Numerical experiences

The objectives of the following numerical experiences are twofold. The first case aims to illustrate the behavior of our algorithms in the solution of image reconstruction problems. In this case, aiming at displaying the quality of the reconstructed image obtained with different methods, we compare our results with those obtained with ART [5] and CAV [6]. Those algorithms have been used with the same purpose in recent papers. The second case aims to compare our algorithms (Subsect. 4.2) with other methods, reporting the number of iterations needed for satisfying the convergence conditions and the corresponding CPU time. Here we used the following methods: Landweber (LANW) [16], with relaxation parameter γ = 2/L, Double ART (DART) [2], and the Kaczmarz Extended (KE) algorithm [18], which are simultaneous and sequential projection methods. All methods were implemented sequentially and the experiences were run on a PC AMD Sempron 2.6 with 224 MB RAM.

In the implementations of IOP and IVOP we consider:
1. The parameter γ = 10^{-2} in the initial iteration (k = 0), then γ = 0.5 (k > 0).
2. The matrix D_m: in IOP equal to I_m, and in IVOP D^k_m = I_m + diag(α^k), with α^k_i = (max(1, r^0_i) + ‖a_i‖²/n_i) / 2^k, n_i being the number of nonzero elements of the ith row of A.
3. The stopping condition: | ‖r^{k+1}‖_{D_m} − ‖r^k‖_{D_m} | < 10^{-4} max(‖r^k‖_{D_m}, 1).

The matrix D^k_m used in the implementation of IVOP was the one that gave the best performance on SNARK93 problems in regard to the descent of the norm of the residual, among several alternatives. These alternatives arose from the idea of penalizing more strongly those constraints having a higher average of initial residuals ((r^0_i)²/n_i) or a larger average length ‖a_i‖²/n_i. For the class of SNARK93 problems each a_ij represents the length of the ith ray within the jth cell. This means that the higher the previous quotient, the more meaningful is the information provided by that ray in regard to the cells it intersects. Therefore, the corresponding equation should have a higher weight.

4.1 Experimental results on an image reconstruction problem from projections

We compared our algorithms using an image reconstruction problem from projections. All methods were implemented sequentially using SNARK93 [1], which is composed of a number of modules written in FORTRAN 77 whose aim is to test new algorithms for image reconstruction problems. The SnarkDisplay module allows one to visualize both the phantoms (objects to be reconstructed) and the reconstructions obtained in each iteration. Another useful tool is SnarkEval (C++), which graphically presents the curves of the evaluated parameters and compares the results obtained with different algorithms.

We show the performance of the algorithms on the reconstruction of the Herman head phantom [11], defined by a set of ellipses, with a specific attenuation value attached to each original elliptical region. As a consequence of the discretization of the image reconstruction problem by X-ray transmission, a Cartesian grid of square image elements, called pixels, is defined in the region of interest in such a way that it covers the totality of the image to be reconstructed.

13 Incomplete oblique projections 285 It is assumed that the attenuation function of the X-rays takes a constant value x j in the jth pixel for j = 1, 2,..., n. It is also assumed that the length of the intersection of the ith ray with the jth pixel, denoted by a ij for all i = 1, 2,..., m, j = 1, 2,..., n, represents the weights of the contribution of the jth pixel to the total attenuation of energy along the ith ray. The physical measurement of the total attenuation along the ith ray, denoted by b i, represents the line integral of the unknown function along the ray s path. Therefore, in the discrete model, the line integral turns out to be a finite sum, and the model is given by the linear system Ax = b. This system is very large because n (number of pixels) and m (number of rays) are each of the order 10 5 in order to get high resolution images. The A matrix is highly sparse because less than 1% of the entries are different from zero, because only a few pixels have a non-empty intersection with a particular ray. The systems (see Censor et al. [6] for a more complete description) are generally overdetermined and inconsistent, because in the fully discretized model each line integral of the attenuation along the ith ray is approximated by a finite sum. With a suitable selection of the variables that define the geometry of the data [1] such as angles between rays, the distances between rays when they are parallel, the localization of the sources, etc, the generated problems have full rank matrices Comparisons Different test cases, using the Herman head phantom (Fig. 1), were run. Those tests are the same appearing in [6], that arise from considering a different number of projections and number of rays per projection, leading in such a way to systems with a different number of variables and equations, as shown in Table 1. The performance of IOP and IVOP algorithms are compared with that of CAV2, with relaxation parameter λ k = 2, and ART with λ k = 0.1. This Fig. 1 Herman head phantom

14 286 H. D. Scolnik et al. Table 1 Test cases m n Image size Projections Rays Case Case Case Fig. 2 Relative error for case 2 particular choice of the parameters is the one reported by Censor et al. [6] as those which gave the best results. It is necessary to point out that every iteration of the IOP and IVOP algorithms usually requires more than one inner iteration of ACCIM which in turn demands to use all the equations of the system, as the other algorithms do. In general we can observe that IOP and IVOP need few inner iterations at the beginning, while that number increases when becoming close to the solution. All experiments used x 0 = 0. We show graphs that compare the four algorithms in cases 2 4. In Figs. 2, 3 and 4 the curves representing the relative error of the density with regard to the phantom s density, for each algorithm and up to the 20th iteration are shown. In the following, we will explain the meaning of this variable. Let S denote the set of pairs (i, j) such that the pixel in the ith row and jth column is in the region of interest phantom. We use ρij k to denote the density

Fig. 3  Relative error for case 3

assigned to the pixel in the ith row and jth column of the reconstruction after k iterations, if k ≥ 1, and the density in the corresponding pixel in the phantom if k = 0. We denote by n = N² the total number of pixels of the grid, and by ρ^0 = Σ_{(i,j) ∈ S} ρ^0_{ij} the total density of the phantom. The relative error is

R^k = ( Σ_{(i,j) ∈ S} | ρ^k_{ij} − ρ^0_{ij} | ) / ρ^0.

We can observe a very efficient behavior of IVOP, because within the first 20 iterations it reaches the minimum error with regard to the sought image in only a few iterations. IVOP obtained the least-error image in the third outer iteration (17 inner iterations) for cases 3 and 4, and in the second outer iteration (7 inner iterations) for case 2. In Figs. 5, 6 and 7 we compare the quality of the image reconstructed by each algorithm when it reaches the image with least relative error, among the first 20 iterations, with regard to the sought image. The reconstructed images given by IVOP and IOP in case 3 are better than those obtained by the other algorithms. In case 4, with a finer grid, the quality is similar to the one obtained with ART. The CAV2 algorithm requires a higher number of iterations to achieve that quality.
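For completeness, the relative error measure can be computed directly from the reconstruction and the phantom; the short sketch below assumes both are given as 2-D arrays together with a boolean mask for the region of interest S (names are ours, not SNARK93's).

```python
import numpy as np

def relative_error(recon, phantom, mask):
    """R^k = sum_{(i,j) in S} |rho^k_ij - rho^0_ij| / sum_{(i,j) in S} rho^0_ij,
    where S is the set of pixels inside the region of interest."""
    diff = np.abs(recon - phantom)[mask]
    return diff.sum() / phantom[mask].sum()

# example: a 3x3 phantom, a slightly perturbed reconstruction, full mask
phantom = np.array([[0.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 0.0]])
recon = phantom + 0.1
mask = np.ones_like(phantom, dtype=bool)
print(relative_error(recon, phantom, mask))   # 0.9 / 6.0 = 0.15
```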

16 288 H. D. Scolnik et al. Fig. 4 Relative error for case 4 In Table 2 we give the CPU times required by each algorithm for reaching the least relative error solution. IOP and IVOP use more CPU time than ART in this sequential implementation although the quality of the reconstructed image is better in case 3. Such a difference must vanish if the parallel properties of IVOP and IOP are exploited. For IOP and IVOP we give number of inner iterations/number of outer iterations. 4.2 Numerical results (second case) Here we compare our algorithms with the LANW method using the relaxation parameter γ = 2/L, where L stands for an estimation of the maximum eigenvalue of A T A given in [2], and the Double ART Algorithm (DART) [2]. The stopping criteria for the Landweber s algorithm is the same than the one for IVOP. The stopping criteria for the two processes involved in DART were the magnitude of the residuals of the respective consistent systems A T w = 0 and Ax = b w f, where w f is the obtained solution for the first one. If Rnk is the residual in the kth iteration, the algorithm stops: If Rnk < 10 4, for solving A T w = 0, and If Rnk < 10 4,forAx = b w f. The test problems used for these numerical experiences are some of the generated by SNARK93.

17 Incomplete oblique projections 289 Fig. 5 Case 2: comparison of the least relative errors for the reconstructed image In Table 3 the dimensions of the test problems are shown using m for the number of rows, n for the number of columns and nnul for the number of nonzero elements in the generated matrix. The results obtained with DART were extremely sensitive to the precision required for computing the solution w f of A T w = 0. Due to this reason the problems were solved with the KE Algorithm [18] which, given w k 1 and x k 1, computes w k by performing a complete Kaczmarz cycle for A T w = 0, and adapts x k by making a similar complete cycle in Ax = b w k, restarting at x k 1. The initial point for solving A T w = 0isw 0 = b. In this case, the used stopping criteria is: If Ax k b + w k < This is less strict than the one used by the author in [18], but we preferred this choice because it is enough for our comparisons. In Table 4 we show for each algorithm the number of iterations, the final residual, and the CPU time in seconds required for reaching convergence. For

18 290 H. D. Scolnik et al. Fig. 6 Case 3: comparison of the least relative errors for the reconstructed image the methods IOP and IVOP we also included the total number of inner iterations/outer iterations. For DART we reported the number of iterations needed for reaching the termination criteria for each one of the systems. It is possible to observe that the LANW algorithm leads to a final residual greater than those of the other algorithms, something that seems to be related to its convergence rate and to the fact the stopping criteria is not too tight. Anyhow we do believe this is enough for analyzing the algorithm s behavior. The final residuals are of the same order for all tested algorithms, but the number of iterations may vary drastically if one seeks for a lower feasible residual. In Table 4 it is possible to observe that both IOP and IVOP need a few outer iterations to converge with a residual similar to those of the other methods. The CPU time required for obtaining those residuals is less than the ones insumed by the other algorithms. Every inner iteration of the new algorithms is equivalent to a cycle through all equations of the system as LANW does. On the contrary,

19 Incomplete oblique projections 291 Fig. 7 Case 4: comparison of the least relative errors for the reconstructed image Table 2 CPU time required for reaching the least relative error Method Least relative error (iter/20) CPU time (s) Case 2 ART CAV IVOP 7/2 5.9 IOP 17/8 8.8 Case 3 ART CAV IVOP 17/ IOP 23/ Case 4 ART CAV IVOP 17/ IOP 23/

20 292 H. D. Scolnik et al. Table 3 Numerical tests Problem m n nnul B B B B B Table 4 Numerical tests Problem/method IOP IVOP LANW DART KE B1 Iter 63/11 67/ Rnk CPU B2 Iter 215/58 201/ Rnk CPU B7 Iter 184/24 174/ Rnk CPU B8 1 Iter 142/14 174/ Rnk CPU B8 2 Iter 558/87 605/ Rnk CPU each iteration of KE requires two Kaczmarz cycles. This explains why in the Table 4 it uses fewer iterations but the CPU time is greater than the one used by IOP and IVOP. DART very often uses many iterations for solving A T w = 0, but just a few for Ax = b w f. This conflict arises from the termination criteria used for solving A T w = 0. We believe that KE is very efficient for solving small problems but not for larger ones. Aiming at exhibiting the behavior of the different algorithms in regard to the reduction of the norm of the residuals in the first iterations, we compare in Table 5 their magnitudes after a prescribed number of iterations (Iter p ). After knowing the number of iterations required by IOP and IVOP for completing the first, second and third iteration, we took the maximum of both as Iter p.in this table we excluded DART because before solving the compatible system Ax = b w f it had to solve A T w = 0, performing the necessary number of iterations until getting convergence. Therefore, the iterative procedures cannot be compared. We denote by Rn 0 the Euclidean norm of the initial residual. From the comparisons in Table 5 we see that IVOP minimizes more rapidly the norm of the residual in the first outer iterations than any other algorithm, although it uses a greater number of inner iterations that IOP. Also KE needs the double of cycles per iteration, and therefore for the same number of iterations the corresponding CPU time practically duplicates those of IOP and IVOP.

21 Incomplete oblique projections 293 Table 5 Comparing the descent of the residual s norms Problem/method IOP IVOP Iter p LANW KE B / / / /8 Rn 0 = / / / / / / / /20 B / / / /4 Rn 0 = /6 9.13/ / /7 7.46/ / / /11 B / / / /12 Rn 0 = / / / / / / / /29 B / / /4 1322/26 Rn 0 = / / / / / / / /10 B / / / /4 Rn 0 = /6 8.93/ / /7 6.83/8 5.92/ / /10 As a conclusion we can say the new algorithms give very good results when compared with the other ones, and they are also very useful for image reconstruction problems because they minimized the norm of the residual very rapidly in a few iterations. 5 Conclusions The general acceleration scheme within the PAM framework, chosen for dealing with inconsistent systems presented in this paper, turned out to be very efficient when applied to different projection algorithms. The ACCIM algorithm used for computing IOP, can be easily implemented in parallel and can also be extended for considering blocks of equalities in a similar way as it was done for the ACIPA algorithm in [9] for linear inequalities. That extension will be presented in a forthcoming paper, together with the results of its application to other problems as well as several alternatives for choosing the matrices D m in more general contexts. The practical consequences of the new technique are far reaching for realworld reconstruction problems in which iterative algorithms are used, and in other fields where there is a need to solve large and unstructured systems of linear equations. Acknowledgments To the unknown referees for their very interesting comments and suggestions. A Appendix A.1 Properties of the ACCIM algorithm First, we will point out some properties of the ACCIM algorithm which is the basis for computing approximate solutions of the problem (12) in every

22 294 H. D. Scolnik et al. iteration of the Algorithms 3 and 4. Those results are needed for proving the convergence of the new algorithms. Given a consistent system Āy = b, where Ā R m n, we denote by y to any solution of it. Let {y j } be the sequence generated by a version of the ACCIM algorithm with a D-norm, we denote r j = Āy j b the residual at each iterate y j. The direction d j defined in the Algorithm 2 by the equality (4)is d j = m ( w l P D l (y j ) y j) m = l=1 l=1 w l r j l ā l 2 D 1 D 1 ā l. (26) From condition (7) we know that at each iterate y j = y, the direction ˆ d j = P v (d j ), where v = ˆd j 1, satisfies ( ˆ d j ) T D(y j y + λ j ˆ d j ) = 0, (27) and therefore the next iterate y j+1 = y j + λ ˆ jd j satisfies ( d ˆj ) T D(y j+1 y ) = 0. From the definition of d ˆj we know that d ˆj = d j (ˆd j 1 ) T Dd ˆj ˆd j 1 ; then multiplying d ˆj by D(y y j ), considering that by (27) (y y j ) T Dˆd j 1 = 0, we get ˆd j 1 2 D that ( d ˆj ) T D(y y j ) coincides with (d j ) T D(y y j ). Also, using the formula of d ˆj, and multiplying it by Dd ˆj it follows that ( d ˆj ) T Dˆd j coincides with d j Dˆd j since d ˆj is D-orthogonal to ˆd j 1. Thus, using these equalities in (27) we obtain that (d j ) T D(y y j ) λ j (d j ) T Dˆd j = 0. (28) Therefore, the next iterate y j+1 = y j +λ j ˆd j is such that y j+1 y is D-orthogonal to both directions d j and d ˆj. Lemma 2 If y j = y,j> 0, is generated by the ACCIM algorithm, then (i) d j is D-orthogonal to y y i, for all i < j. (ii) ˆd j is D-orthogonal to ˆd i and d i, for all i < j. Furthermore, (iii) y y j is D-orthogonal to ˆd i, for all i < j, and as a consequence is also D-orthogonal to y j y 0. Proof For j = 1, from (27) we know that (d 0 ) T D(y 1 y ) = 0, where y 1 = y 0 + λ 0 d 0. Then, using the definition of d 1 = m 1 w i D 1 ri ā 1 i, we see that ā i 2 D 1 (d 1 ) T D(y y 0 ) coincides with (d 0 ) T D(y 1 y ) = 0, hence follows (i). Also, by definition of ˆd 1, (ii) holds. Assuming that (i) and (ii) hold for a given j, j 1, we shall prove that they are also valid for j + 1. Thus, we assume that d j is D-orthogonal to y y i, for all i < j, and ˆd j is D-orthogonal to ˆd i and d i, for all i < j. Aty j+1 = y j + λ j ˆd j, the direction is d j+1 = m l=1 w l D 1 r j+1 l ā l, and ā l 2 D 1

23 Incomplete oblique projections 295 considering that r j+1 = r j + Ā(y j+1 y j ), we get d j+1 = d j m w l D 1 ā l ā T l ā l 2 (y j+1 y j ). (29) D 1 l=1 Multiplying (29) byd(y y j ) we obtain by the definition of y j+1,bythe obvious equality r j l = ā T l (y j y ) and by (26) and (28) that (d j+1 ) T D(y y j ) coincides with (d j ) T D(y y j ) m (y l=1 w j y ) T DD 1 ā l ā T l (y j+1 y j ) l, which is equal ā l 2 D 1 r j l to (d j ) T D(y y j )+( m l=1 w l ā ā l 2 l ) T λ j (ˆd j ) and to (d j ) T D(y y j ) λ j (ˆd j ) T D 1 Dd j = 0. Thus, d j+1 is D-orthogonal to y y j. Moreover, multiplying (29) by D(y y i ),fori < j, due to the inductive hypothesis we know that (d j ) T D(y y i ) = 0 and (d i ) T Dˆd j = 0, and therefore d j+1 is D-orthogonal to y y i, for i < j. Hence, we get that d j+1 is D-orthogonal to y y i,fori < j + 1. By definition ˆd j+1 = d j+1 (d j+1 ) T Dˆd j ) ˆd j, which means that ˆd j+1 is D-orthogonal ˆd j 2 D to ˆd j. Furthermore, since (i) we know that d j+1 is D-orthogonal to y y i for i < j + 1, then it is also D-orthogonal to y i y i 1,fori < j + 1. Then, considering that y i y i 1 = λ ˆdi 1 i 1, it follows that d j+1 is D-orthogonal to ˆd i 1,for i < j + 1. Then, using the inductive hypothesis, ˆd j is D-orthogonal to ˆd i, for all i < j. The conclusion is that ˆd j+1 is D-orthogonal to ˆd i for all i < j + 1. Furthermore, since ˆd i = d i (di ) T Dˆd i 1 ˆd i 1, for all i < j + 1, using that ˆd j+1 ˆd i 1 2 D is D-orthogonal to ˆd i and ˆd i 1, we also get that ˆd j+1 is D-orthogonal to d i,for all i < j + 1. Finally, to see (iii) we use the fact that as a consequence of (27) ˆd i is D-orthogonal to y i+1 y, for all i < j. On the other hand, using (ii) we know that ˆd i is also D-orthogonal to ˆd s,fori < s < j, then ˆd i is also D-orthogonal to y j y i+1. Hence, for all i < j, ˆd i is D-orthogonal to y j y. Therefore, y j y 0 = j 1 i=0 λ i ˆd i is D-orthogonal to y j y. A.2 Applying ACCIM to solve Ax r = b We will consider in the following the application of the ACCIM algorithm for solving Az I m µ = b, required in every step of Algorithms 3 and 4. Given p k and q k =[x k ;0], k 0, ACCIM computes an approximation [z j ; µ j ] to the projection p Dk min (qk ). In Algorithm 3 the matrix D k = D is constant [see (11)], while in Algorithm 4 the matrix is variable [see (25)]. Given q k = [x k ;0] Q, in order to solve (12), from the starting point [z 0 ; µ 0 ] = q k = [x k ;0], at each [z j ; µ j ], we denote by s j the residual s j =

24 296 H. D. Scolnik et al. Az j µ j b. The direction d j R n+m, can be written as d j = m i=1 w i s j i ā i 2 D k 1 (D k ) 1 ā i, (30) being ā i =[a i ; e i ] the i-column of [A, I m ] T, where e i denotes the i-column of I m. The square of the norm of each row of the matrix [A, I m ], induced by the inverse of D k,is 1+ 1/(D k m ) i,if a i =1 and (D k m ) i denotes the ith element of the diagonal of D k m. Hence, the direction d j =[d j 1 ; d j 2 ] has its first n components given by: and the next m-components are d j 1 = m i=1 d j 2 = (Dk m ) 1 w i (D k m ) i s i j 1 + (D k m ) a i (31) i m i=1 w i (D k m ) i s i j 1 + (D k m ) e i. (32) i We choose for this application of ACCIM the values w i : w i = 1 + (D k m ) i, because they privilege the rows with greatest 1 + (D k m ) i. Thus, at each iterate [z j ; µ j ], the directions are: d j m 1 = (D k m ) i a isi j = A T D k m ( s j ) and d j m 2 = e i si j = s j. (33) i=1 This direction d j satisfies that d j 1 = AT D k m ( d j 2 ). Similar behavior has the direction d ˆj =[ˆd j 1 ; ˆd j 2 ], where ˆd j 1 = AT D k m ( ˆd j 2 ), expression that follows from considering that ˆd j 1 = d j 1 (ˆd j 1 ) T D k d j j 1 ˆd ˆd j 1 2 1, and ˆd j 2 = d j 2 (ˆd j 1 ) T D k d j j 1 ˆd D k ˆd j 1 2 2, D k and using that ˆ d0 coincides with d 0. Hence, for each j > 0, [z j ; µ j ] satisfies z j z 0 = A T D k m ( µ j ). Remark 6 We will describe now the properties of the accepted approximation ˆp =[z j ; µ j ], in the iterations of the Algorithms 3 and 4. (i) From (iii) of the previous Lemma, we know that for every solution [z ; µ ] of Az µ = b, [z ; µ ] [z j ; µ j ] is D k -orthogonal to [z j ; µ j ] [z 0, µ 0 ]. Thus, for all p P, we have that (p ˆp) T D k (ˆp q k ) = 0. (ii) From (i) of the same Lemma we know that for every [z j ; µ j ], j > 0, the direction d j satisfies d jt D k ([z ; µ ] [z 0 ; µ 0 ]) = 0. Then, since p k P and q k =[z 0 ; µ 0 ], in particular d jt D k (p k q k ) = 0. Hence, using the expression of d j in (33), and considering that p k q k =[0; r k ], we get that s j is D k m -orthogonal to rk. i=1
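The practical consequence of (33) is that the inner pass never needs to form the augmented matrix [A, −I_m]: both blocks of the search direction come from the current residual s^j alone. A small sketch follows, assuming the rows of A are normalized (as in Sect. 2) and using the weights w_i = 1 + (D^k_m)_i chosen above; the naming is ours and the check against the generic augmented-system direction is only a numerical sanity test.

```python
import numpy as np

def augmented_direction(A, Dm, s):
    """Direction (33) of the inner ACCIM pass applied to A z - mu = b with
    weights w_i = 1 + (D_m)_i and normalized rows of A:
    d1 = -A^T D_m s (moves z), d2 = s (moves mu)."""
    d1 = -A.T @ (Dm * s)
    d2 = s.copy()
    return d1, d2

# sanity check against the generic ACCIM direction on the augmented system
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))
A /= np.linalg.norm(A, axis=1, keepdims=True)          # normalize rows
Dm = 1.0 + rng.random(4)                               # D_m^k = I_m + diag(alpha)
s = rng.standard_normal(4)
Abar = np.hstack([A, -np.eye(4)])
Dk = np.concatenate([np.ones(3), Dm])
w = 1.0 + Dm                                           # weights w_i = 1 + (D_m)_i
denom = np.einsum('ij,j,ij->i', Abar, 1.0 / Dk, Abar)  # = 1 + 1/(D_m)_i here
d_generic = -(1.0 / Dk) * (Abar.T @ (w * s / denom))
d1, d2 = augmented_direction(A, Dm, s)
assert np.allclose(d_generic, np.concatenate([d1, d2]))
```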

25 Incomplete oblique projections 297 B Convergence of the algorithms Aiming at proving that the sequences {p k } generated by the two new algorithms are convergent we will first describe the properties shared by both. Since we will study the relationship between two consecutive iterates p k and p k+1, depending of the D-norm in Algorithm 3, and of the variable D k -norm in Algorithm 4, we will use the more general notation D k for both. We will distinguish in relation to both algorithms the following sets: L D m sq the set of solutions to the problem (2), L D m sq = { x R n : forwhich r = Ax b satisfies A T D m r = 0 }, (34) and the corresponding S D m = { p : p =[x ; r ] P such that x L D } m sq, (35) D m being the matrix of (11). In particular, D m = I m corresponds to the set of solutions of the classical least squares problem. For the sequence given by Algorithm 3 we will prove that converges to an element of the set S D m,withd m the matrix arising from (11), while for the sequence given by Algorithm 4 we will prove convergence to an element of the set of solutions of the classical least squares problem (D m = I m ). Given an initial point q 0 =[x 0 ;0] Q for both algorithms we consider x = arg and denote p =[ x ; r ] S D m, which satisfies p = arg because q 0 p 2 D = x0 x 2 + r 2 D m. min x 0 x 2 (36) x L Dm sq min q 0 p 2 p S Dm D, (37) Lemma 3 Let {p k }={[x k ; r k ]} be the sequence generated by either the Algorithm 3 or 4, then (i) p k =[x k ; r k ] and p k+1 =[x k+1 ; r k+1 ] satisfy r k+1 2 D k m r k 2 (1 D k m γ) p k p Dk min (qk ) 2. D k (ii) The sequence { r k D k m } is decreasing and bounded, therefore it converges. (iii) (iv) (v) The following three sequences tend to zero: { p k p Dk min (qk ) 2 }, D k { p k+1 p Dk min (qk ) 2 }, and { p k+1 p k 2 }. D k D k The sequence { A T D k m rk } goes to zero. If x = arg min x L Dm x x k 2, then x also fulfills that property in regard sq to x k+1.

26 298 H. D. Scolnik et al. Proof We get, as a consequence of the definition of p k+1, and the corresponding q k+1 that r k+1 2 p k+1 q k 2. Furthermore, due to the ACCIM D k m D k properties in Lemma 2 (iii) of Appendix A, p k+1 q k 2 = p k+1 ˆp 2 + D k D k ˆp q k 2. From the condition (17) for accepting ˆp, p k+1 ˆp 2 γ ˆp D k D k p k 2. Then, p k+1 q k 2 γ ˆp p k 2 + ˆp q k 2. Therefore, as a D k D k D k D k consequence of the Lemma 2 (iii) of Appendix A, ˆp q k 2 = r k 2 D k D k m ˆp p k 2, then we get that q k p k+1 2 γ ˆp p k 2 + r k 2 D k D k D k D k m ˆp p k 2. Hence, q k p k+1 2 r k 2 (1 γ) ˆp p k 2. Then, con- D k D k D k m D k sidering that ˆp p k 2 = ˆp p Dk D k min (qk ) 2 + p Dk D k min (qk ) p k 2, we get D k r k+1 2 q k p k+1 2 r k 2 (1 γ) p k p Dk D k m D k D k min (qk ) 2. m D k By (i), if p Dk min (qk ) = p k, it follows that r k+1 2 < r k 2 When the norm D k m Dm. k is the one induced by D k m = D m, it follows immediately that { r k D k m } is a decreasing sequence. In the case of Algorithm 4, due to the definition of the matrices D k in (25), we have that r k 2 r k 2, and also due to (i) it satisfies r k 2 < r k 1 2. Therefore also in this case { r k D k m D k 1 m D k 1 m D k 1 m D k m } is a decreasing sequence. Therefore, r k 2 is decreasing and convergent. Hence, considering D k m the inequality in (i) and the fact that r k+1 2 r k+1 2 < r k 2 D k+1 m D k m Dm,wealso k obtain that p k p Dk min (qk ) 2 tends to zero. D k Furthermore, since by (17) p k ˆp 2 1/γ ˆp p k+1 2, and considering that ˆp p k+1 2 = ˆp p Dk D k min (qk ) 2 + p k+1 p Dk D k min (qk ) 2, and p k D k D k D k ˆp 2 = ˆp p Dk D k min (qk ) 2 + p k p Dk D k min (qk ) 2, we get that p k p Dk D k min (qk ) 2 D k p k+1 p Dk min (qk ) 2. Hence, since p k p Dk D k min (qk ) goes to zero, it follows that p k+1 p Dk min (qk ) 2 tends to zero. D k Since p k+1 p k D k p k+1 p Dk min (qk ) D k + p Dk min (qk ) p k D k and the right-hand sum tends to zero, then also p k+1 p k D k tends to zero. In order to prove (iv), from the KKT conditions we know that p Dk min (qk ) = [x Dk min ; rdk min ] satisfies xdk min x k = A T D k m rdk min. Then, since pk p Dk min (qk ) 2 = D k x k x Dk min 2 + r k rmin Dk 2, it follows that both A T D k D k m rdk min and rk rmin Dk D k m m tend to zero. Then, as A T D k m rk A T D k m (rk rmin Dk ) + AT D k m rdk min, we get that A T D k m rk goes to zero. Under the hypothesis of (v) we know that p =[ x ; r ] S D m, satisfies p q k 2 = x k x D 2 + r 2 x k x 2 + r 2 for all x k D k m Dm, L D m k sq. Due to the fact that the coordinates of p k+1 =[x k+1 ; r k+1 ] and q k+1 =[x k+1 ;0] are given by the accepted approximation ˆp =[x k+1 ; µ j ], they inherit its properties. Thus, since for all p P, p ˆp is D k -orthogonal to ˆp q k, and in particular p q k 2 = x D x k 2 + r 2 = ˆp q k 2 + p k D k m D ˆp 2, is the minimum k D k


More information

3 (Maths) Linear Algebra

3 (Maths) Linear Algebra 3 (Maths) Linear Algebra References: Simon and Blume, chapters 6 to 11, 16 and 23; Pemberton and Rau, chapters 11 to 13 and 25; Sundaram, sections 1.3 and 1.5. The methods and concepts of linear algebra

More information

Numerical Methods I Non-Square and Sparse Linear Systems

Numerical Methods I Non-Square and Sparse Linear Systems Numerical Methods I Non-Square and Sparse Linear Systems Aleksandar Donev Courant Institute, NYU 1 donev@courant.nyu.edu 1 MATH-GA 2011.003 / CSCI-GA 2945.003, Fall 2014 September 25th, 2014 A. Donev (Courant

More information

Error estimates for the Raviart-Thomas interpolation under the maximum angle condition

Error estimates for the Raviart-Thomas interpolation under the maximum angle condition Error estimates for the Raviart-Thomas interpolation under the maximum angle condition Ricardo G. Durán and Ariel L. Lombardi Abstract. The classical error analysis for the Raviart-Thomas interpolation

More information

Lecture 5 : Projections

Lecture 5 : Projections Lecture 5 : Projections EE227C. Lecturer: Professor Martin Wainwright. Scribe: Alvin Wan Up until now, we have seen convergence rates of unconstrained gradient descent. Now, we consider a constrained minimization

More information

USE OF THE INTERIOR-POINT METHOD FOR CORRECTING AND SOLVING INCONSISTENT LINEAR INEQUALITY SYSTEMS

USE OF THE INTERIOR-POINT METHOD FOR CORRECTING AND SOLVING INCONSISTENT LINEAR INEQUALITY SYSTEMS An. Şt. Univ. Ovidius Constanţa Vol. 9(2), 2001, 121 128 USE OF THE INTERIOR-POINT METHOD FOR CORRECTING AND SOLVING INCONSISTENT LINEAR INEQUALITY SYSTEMS Elena Popescu Dedicated to Professor Mirela Ştefănescu

More information

Weaker hypotheses for the general projection algorithm with corrections

Weaker hypotheses for the general projection algorithm with corrections DOI: 10.1515/auom-2015-0043 An. Şt. Univ. Ovidius Constanţa Vol. 23(3),2015, 9 16 Weaker hypotheses for the general projection algorithm with corrections Alexru Bobe, Aurelian Nicola, Constantin Popa Abstract

More information

Conditions for Robust Principal Component Analysis

Conditions for Robust Principal Component Analysis Rose-Hulman Undergraduate Mathematics Journal Volume 12 Issue 2 Article 9 Conditions for Robust Principal Component Analysis Michael Hornstein Stanford University, mdhornstein@gmail.com Follow this and

More information

Alternating Minimization and Alternating Projection Algorithms: A Tutorial

Alternating Minimization and Alternating Projection Algorithms: A Tutorial Alternating Minimization and Alternating Projection Algorithms: A Tutorial Charles L. Byrne Department of Mathematical Sciences University of Massachusetts Lowell Lowell, MA 01854 March 27, 2011 Abstract

More information

Iterative Methods for Solving A x = b

Iterative Methods for Solving A x = b Iterative Methods for Solving A x = b A good (free) online source for iterative methods for solving A x = b is given in the description of a set of iterative solvers called templates found at netlib: http

More information

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings

Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019 CONTACTS Pierre Duysinx Institut de Mécanique et du Génie Civil (B52/3) Phone number: 04/366.91.94 Email: P.Duysinx@uliege.be

More information

Notes on basis changes and matrix diagonalization

Notes on basis changes and matrix diagonalization Notes on basis changes and matrix diagonalization Howard E Haber Santa Cruz Institute for Particle Physics, University of California, Santa Cruz, CA 95064 April 17, 2017 1 Coordinates of vectors and matrix

More information

15 Singular Value Decomposition

15 Singular Value Decomposition 15 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

A convergence result for an Outer Approximation Scheme

A convergence result for an Outer Approximation Scheme A convergence result for an Outer Approximation Scheme R. S. Burachik Engenharia de Sistemas e Computação, COPPE-UFRJ, CP 68511, Rio de Janeiro, RJ, CEP 21941-972, Brazil regi@cos.ufrj.br J. O. Lopes Departamento

More information

Linear Systems. Carlo Tomasi. June 12, r = rank(a) b range(a) n r solutions

Linear Systems. Carlo Tomasi. June 12, r = rank(a) b range(a) n r solutions Linear Systems Carlo Tomasi June, 08 Section characterizes the existence and multiplicity of the solutions of a linear system in terms of the four fundamental spaces associated with the system s matrix

More information

Linear programming: theory, algorithms and applications

Linear programming: theory, algorithms and applications Linear programming: theory, algorithms and applications illes@math.bme.hu Department of Differential Equations Budapest 2014/2015 Fall Vector spaces A nonempty set L equipped with addition and multiplication

More information

A Note on Inverse Iteration

A Note on Inverse Iteration A Note on Inverse Iteration Klaus Neymeyr Universität Rostock, Fachbereich Mathematik, Universitätsplatz 1, 18051 Rostock, Germany; SUMMARY Inverse iteration, if applied to a symmetric positive definite

More information

Knowledge Discovery and Data Mining 1 (VO) ( )

Knowledge Discovery and Data Mining 1 (VO) ( ) Knowledge Discovery and Data Mining 1 (VO) (707.003) Review of Linear Algebra Denis Helic KTI, TU Graz Oct 9, 2014 Denis Helic (KTI, TU Graz) KDDM1 Oct 9, 2014 1 / 74 Big picture: KDDM Probability Theory

More information

(1) for all (2) for all and all

(1) for all (2) for all and all 8. Linear mappings and matrices A mapping f from IR n to IR m is called linear if it fulfills the following two properties: (1) for all (2) for all and all Mappings of this sort appear frequently in the

More information

Math Introduction to Numerical Analysis - Class Notes. Fernando Guevara Vasquez. Version Date: January 17, 2012.

Math Introduction to Numerical Analysis - Class Notes. Fernando Guevara Vasquez. Version Date: January 17, 2012. Math 5620 - Introduction to Numerical Analysis - Class Notes Fernando Guevara Vasquez Version 1990. Date: January 17, 2012. 3 Contents 1. Disclaimer 4 Chapter 1. Iterative methods for solving linear systems

More information

Set, functions and Euclidean space. Seungjin Han

Set, functions and Euclidean space. Seungjin Han Set, functions and Euclidean space Seungjin Han September, 2018 1 Some Basics LOGIC A is necessary for B : If B holds, then A holds. B A A B is the contraposition of B A. A is sufficient for B: If A holds,

More information

The Newton Bracketing method for the minimization of convex functions subject to affine constraints

The Newton Bracketing method for the minimization of convex functions subject to affine constraints Discrete Applied Mathematics 156 (2008) 1977 1987 www.elsevier.com/locate/dam The Newton Bracketing method for the minimization of convex functions subject to affine constraints Adi Ben-Israel a, Yuri

More information

Copositive Plus Matrices

Copositive Plus Matrices Copositive Plus Matrices Willemieke van Vliet Master Thesis in Applied Mathematics October 2011 Copositive Plus Matrices Summary In this report we discuss the set of copositive plus matrices and their

More information

Interior-Point Methods for Linear Optimization

Interior-Point Methods for Linear Optimization Interior-Point Methods for Linear Optimization Robert M. Freund and Jorge Vera March, 204 c 204 Robert M. Freund and Jorge Vera. All rights reserved. Linear Optimization with a Logarithmic Barrier Function

More information

A derivative-free nonmonotone line search and its application to the spectral residual method

A derivative-free nonmonotone line search and its application to the spectral residual method IMA Journal of Numerical Analysis (2009) 29, 814 825 doi:10.1093/imanum/drn019 Advance Access publication on November 14, 2008 A derivative-free nonmonotone line search and its application to the spectral

More information

Principal Component Analysis

Principal Component Analysis Machine Learning Michaelmas 2017 James Worrell Principal Component Analysis 1 Introduction 1.1 Goals of PCA Principal components analysis (PCA) is a dimensionality reduction technique that can be used

More information

Linear Systems. Carlo Tomasi

Linear Systems. Carlo Tomasi Linear Systems Carlo Tomasi Section 1 characterizes the existence and multiplicity of the solutions of a linear system in terms of the four fundamental spaces associated with the system s matrix and of

More information

Relation of Pure Minimum Cost Flow Model to Linear Programming

Relation of Pure Minimum Cost Flow Model to Linear Programming Appendix A Page 1 Relation of Pure Minimum Cost Flow Model to Linear Programming The Network Model The network pure minimum cost flow model has m nodes. The external flows given by the vector b with m

More information

5 Handling Constraints

5 Handling Constraints 5 Handling Constraints Engineering design optimization problems are very rarely unconstrained. Moreover, the constraints that appear in these problems are typically nonlinear. This motivates our interest

More information

(v, w) = arccos( < v, w >

(v, w) = arccos( < v, w > MA322 F all203 Notes on Inner Products Notes on Chapter 6 Inner product. Given a real vector space V, an inner product is defined to be a bilinear map F : V V R such that the following holds: For all v,

More information

Course Notes: Week 1

Course Notes: Week 1 Course Notes: Week 1 Math 270C: Applied Numerical Linear Algebra 1 Lecture 1: Introduction (3/28/11) We will focus on iterative methods for solving linear systems of equations (and some discussion of eigenvalues

More information

Introduction to Compressed Sensing

Introduction to Compressed Sensing Introduction to Compressed Sensing Alejandro Parada, Gonzalo Arce University of Delaware August 25, 2016 Motivation: Classical Sampling 1 Motivation: Classical Sampling Issues Some applications Radar Spectral

More information

Numerical Linear Algebra

Numerical Linear Algebra Numerical Linear Algebra Decompositions, numerical aspects Gerard Sleijpen and Martin van Gijzen September 27, 2017 1 Delft University of Technology Program Lecture 2 LU-decomposition Basic algorithm Cost

More information

Program Lecture 2. Numerical Linear Algebra. Gaussian elimination (2) Gaussian elimination. Decompositions, numerical aspects

Program Lecture 2. Numerical Linear Algebra. Gaussian elimination (2) Gaussian elimination. Decompositions, numerical aspects Numerical Linear Algebra Decompositions, numerical aspects Program Lecture 2 LU-decomposition Basic algorithm Cost Stability Pivoting Cholesky decomposition Sparse matrices and reorderings Gerard Sleijpen

More information

MAT 610: Numerical Linear Algebra. James V. Lambers

MAT 610: Numerical Linear Algebra. James V. Lambers MAT 610: Numerical Linear Algebra James V Lambers January 16, 2017 2 Contents 1 Matrix Multiplication Problems 7 11 Introduction 7 111 Systems of Linear Equations 7 112 The Eigenvalue Problem 8 12 Basic

More information

Glossary of Linear Algebra Terms. Prepared by Vince Zaccone For Campus Learning Assistance Services at UCSB

Glossary of Linear Algebra Terms. Prepared by Vince Zaccone For Campus Learning Assistance Services at UCSB Glossary of Linear Algebra Terms Basis (for a subspace) A linearly independent set of vectors that spans the space Basic Variable A variable in a linear system that corresponds to a pivot column in the

More information

G1110 & 852G1 Numerical Linear Algebra

G1110 & 852G1 Numerical Linear Algebra The University of Sussex Department of Mathematics G & 85G Numerical Linear Algebra Lecture Notes Autumn Term Kerstin Hesse (w aw S w a w w (w aw H(wa = (w aw + w Figure : Geometric explanation of the

More information

The Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment

The Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment he Nearest Doubly Stochastic Matrix to a Real Matrix with the same First Moment William Glunt 1, homas L. Hayden 2 and Robert Reams 2 1 Department of Mathematics and Computer Science, Austin Peay State

More information

Linear Algebra Massoud Malek

Linear Algebra Massoud Malek CSUEB Linear Algebra Massoud Malek Inner Product and Normed Space In all that follows, the n n identity matrix is denoted by I n, the n n zero matrix by Z n, and the zero vector by θ n An inner product

More information

Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX

Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX Introduction to Quantitative Techniques for MSc Programmes SCHOOL OF ECONOMICS, MATHEMATICS AND STATISTICS MALET STREET LONDON WC1E 7HX September 2007 MSc Sep Intro QT 1 Who are these course for? The September

More information

9. The determinant. Notation: Also: A matrix, det(a) = A IR determinant of A. Calculation in the special cases n = 2 and n = 3:

9. The determinant. Notation: Also: A matrix, det(a) = A IR determinant of A. Calculation in the special cases n = 2 and n = 3: 9. The determinant The determinant is a function (with real numbers as values) which is defined for square matrices. It allows to make conclusions about the rank and appears in diverse theorems and formulas.

More information

14 Singular Value Decomposition

14 Singular Value Decomposition 14 Singular Value Decomposition For any high-dimensional data analysis, one s first thought should often be: can I use an SVD? The singular value decomposition is an invaluable analysis tool for dealing

More information

b 1 b 2.. b = b m A = [a 1,a 2,...,a n ] where a 1,j a 2,j a j = a m,j Let A R m n and x 1 x 2 x = x n

b 1 b 2.. b = b m A = [a 1,a 2,...,a n ] where a 1,j a 2,j a j = a m,j Let A R m n and x 1 x 2 x = x n Lectures -2: Linear Algebra Background Almost all linear and nonlinear problems in scientific computation require the use of linear algebra These lectures review basic concepts in a way that has proven

More information

Optimality Conditions for Constrained Optimization

Optimality Conditions for Constrained Optimization 72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)

More information

Properties of Matrices and Operations on Matrices

Properties of Matrices and Operations on Matrices Properties of Matrices and Operations on Matrices A common data structure for statistical analysis is a rectangular array or matris. Rows represent individual observational units, or just observations,

More information

Least Squares Optimization

Least Squares Optimization Least Squares Optimization The following is a brief review of least squares optimization and constrained optimization techniques. Broadly, these techniques can be used in data analysis and visualization

More information

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian

2. Linear algebra. matrices and vectors. linear equations. range and nullspace of matrices. function of vectors, gradient and Hessian FE661 - Statistical Methods for Financial Engineering 2. Linear algebra Jitkomut Songsiri matrices and vectors linear equations range and nullspace of matrices function of vectors, gradient and Hessian

More information

Assignment 1 Math 5341 Linear Algebra Review. Give complete answers to each of the following questions. Show all of your work.

Assignment 1 Math 5341 Linear Algebra Review. Give complete answers to each of the following questions. Show all of your work. Assignment 1 Math 5341 Linear Algebra Review Give complete answers to each of the following questions Show all of your work Note: You might struggle with some of these questions, either because it has

More information

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES

AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES AN ELEMENTARY PROOF OF THE SPECTRAL RADIUS FORMULA FOR MATRICES JOEL A. TROPP Abstract. We present an elementary proof that the spectral radius of a matrix A may be obtained using the formula ρ(a) lim

More information

(v, w) = arccos( < v, w >

(v, w) = arccos( < v, w > MA322 Sathaye Notes on Inner Products Notes on Chapter 6 Inner product. Given a real vector space V, an inner product is defined to be a bilinear map F : V V R such that the following holds: For all v

More information

文件中找不不到关系 ID 为 rid3 的图像部件. Review of LINEAR ALGEBRA I. TA: Yujia Xie CSE 6740 Computational Data Analysis. Georgia Institute of Technology

文件中找不不到关系 ID 为 rid3 的图像部件. Review of LINEAR ALGEBRA I. TA: Yujia Xie CSE 6740 Computational Data Analysis. Georgia Institute of Technology 文件中找不不到关系 ID 为 rid3 的图像部件 Review of LINEAR ALGEBRA I TA: Yujia Xie CSE 6740 Computational Data Analysis Georgia Institute of Technology 1. Notations A R $ & : a matrix with m rows and n columns, where

More information

(v, w) = arccos( < v, w >

(v, w) = arccos( < v, w > MA322 F all206 Notes on Inner Products Notes on Chapter 6 Inner product. Given a real vector space V, an inner product is defined to be a bilinear map F : V V R such that the following holds: Commutativity:

More information

RESEARCH ARTICLE. A strategy of finding an initial active set for inequality constrained quadratic programming problems

RESEARCH ARTICLE. A strategy of finding an initial active set for inequality constrained quadratic programming problems Optimization Methods and Software Vol. 00, No. 00, July 200, 8 RESEARCH ARTICLE A strategy of finding an initial active set for inequality constrained quadratic programming problems Jungho Lee Computer

More information

Robust l 1 and l Solutions of Linear Inequalities

Robust l 1 and l Solutions of Linear Inequalities Available at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Applications and Applied Mathematics: An International Journal (AAM) Vol. 6, Issue 2 (December 2011), pp. 522-528 Robust l 1 and l Solutions

More information

AMS526: Numerical Analysis I (Numerical Linear Algebra)

AMS526: Numerical Analysis I (Numerical Linear Algebra) AMS526: Numerical Analysis I (Numerical Linear Algebra) Lecture 1: Course Overview & Matrix-Vector Multiplication Xiangmin Jiao SUNY Stony Brook Xiangmin Jiao Numerical Analysis I 1 / 20 Outline 1 Course

More information

7. Symmetric Matrices and Quadratic Forms

7. Symmetric Matrices and Quadratic Forms Linear Algebra 7. Symmetric Matrices and Quadratic Forms CSIE NCU 1 7. Symmetric Matrices and Quadratic Forms 7.1 Diagonalization of symmetric matrices 2 7.2 Quadratic forms.. 9 7.4 The singular value

More information

Journal of Symbolic Computation. On the Berlekamp/Massey algorithm and counting singular Hankel matrices over a finite field

Journal of Symbolic Computation. On the Berlekamp/Massey algorithm and counting singular Hankel matrices over a finite field Journal of Symbolic Computation 47 (2012) 480 491 Contents lists available at SciVerse ScienceDirect Journal of Symbolic Computation journal homepage: wwwelseviercom/locate/jsc On the Berlekamp/Massey

More information

Variable Selection for Highly Correlated Predictors

Variable Selection for Highly Correlated Predictors Variable Selection for Highly Correlated Predictors Fei Xue and Annie Qu arxiv:1709.04840v1 [stat.me] 14 Sep 2017 Abstract Penalty-based variable selection methods are powerful in selecting relevant covariates

More information

MATH 2331 Linear Algebra. Section 1.1 Systems of Linear Equations. Finding the solution to a set of two equations in two variables: Example 1: Solve:

MATH 2331 Linear Algebra. Section 1.1 Systems of Linear Equations. Finding the solution to a set of two equations in two variables: Example 1: Solve: MATH 2331 Linear Algebra Section 1.1 Systems of Linear Equations Finding the solution to a set of two equations in two variables: Example 1: Solve: x x = 3 1 2 2x + 4x = 12 1 2 Geometric meaning: Do these

More information

Stochastic Design Criteria in Linear Models

Stochastic Design Criteria in Linear Models AUSTRIAN JOURNAL OF STATISTICS Volume 34 (2005), Number 2, 211 223 Stochastic Design Criteria in Linear Models Alexander Zaigraev N. Copernicus University, Toruń, Poland Abstract: Within the framework

More information

Lecture 1 Systems of Linear Equations and Matrices

Lecture 1 Systems of Linear Equations and Matrices Lecture 1 Systems of Linear Equations and Matrices Math 19620 Outline of Course Linear Equations and Matrices Linear Transformations, Inverses Bases, Linear Independence, Subspaces Abstract Vector Spaces

More information

Chapter Two Elements of Linear Algebra

Chapter Two Elements of Linear Algebra Chapter Two Elements of Linear Algebra Previously, in chapter one, we have considered single first order differential equations involving a single unknown function. In the next chapter we will begin to

More information

Methods for sparse analysis of high-dimensional data, II

Methods for sparse analysis of high-dimensional data, II Methods for sparse analysis of high-dimensional data, II Rachel Ward May 26, 2011 High dimensional data with low-dimensional structure 300 by 300 pixel images = 90, 000 dimensions 2 / 55 High dimensional

More information

A Randomized Nonmonotone Block Proximal Gradient Method for a Class of Structured Nonlinear Programming

A Randomized Nonmonotone Block Proximal Gradient Method for a Class of Structured Nonlinear Programming A Randomized Nonmonotone Block Proximal Gradient Method for a Class of Structured Nonlinear Programming Zhaosong Lu Lin Xiao March 9, 2015 (Revised: May 13, 2016; December 30, 2016) Abstract We propose

More information

2 Two-Point Boundary Value Problems

2 Two-Point Boundary Value Problems 2 Two-Point Boundary Value Problems Another fundamental equation, in addition to the heat eq. and the wave eq., is Poisson s equation: n j=1 2 u x 2 j The unknown is the function u = u(x 1, x 2,..., x

More information

Vector bundles in Algebraic Geometry Enrique Arrondo. 1. The notion of vector bundle

Vector bundles in Algebraic Geometry Enrique Arrondo. 1. The notion of vector bundle Vector bundles in Algebraic Geometry Enrique Arrondo Notes(* prepared for the First Summer School on Complex Geometry (Villarrica, Chile 7-9 December 2010 1 The notion of vector bundle In affine geometry,

More information

Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method

Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method Robert M. Freund March, 2004 2004 Massachusetts Institute of Technology. The Problem The logarithmic barrier approach

More information

Conceptual Questions for Review

Conceptual Questions for Review Conceptual Questions for Review Chapter 1 1.1 Which vectors are linear combinations of v = (3, 1) and w = (4, 3)? 1.2 Compare the dot product of v = (3, 1) and w = (4, 3) to the product of their lengths.

More information

A strongly polynomial algorithm for linear systems having a binary solution

A strongly polynomial algorithm for linear systems having a binary solution A strongly polynomial algorithm for linear systems having a binary solution Sergei Chubanov Institute of Information Systems at the University of Siegen, Germany e-mail: sergei.chubanov@uni-siegen.de 7th

More information

Math 302 Outcome Statements Winter 2013

Math 302 Outcome Statements Winter 2013 Math 302 Outcome Statements Winter 2013 1 Rectangular Space Coordinates; Vectors in the Three-Dimensional Space (a) Cartesian coordinates of a point (b) sphere (c) symmetry about a point, a line, and a

More information

MATH36001 Perron Frobenius Theory 2015

MATH36001 Perron Frobenius Theory 2015 MATH361 Perron Frobenius Theory 215 In addition to saying something useful, the Perron Frobenius theory is elegant. It is a testament to the fact that beautiful mathematics eventually tends to be useful,

More information

Final Review Written by Victoria Kala SH 6432u Office Hours R 12:30 1:30pm Last Updated 11/30/2015

Final Review Written by Victoria Kala SH 6432u Office Hours R 12:30 1:30pm Last Updated 11/30/2015 Final Review Written by Victoria Kala vtkala@mathucsbedu SH 6432u Office Hours R 12:30 1:30pm Last Updated 11/30/2015 Summary This review contains notes on sections 44 47, 51 53, 61, 62, 65 For your final,

More information

a s 1.3 Matrix Multiplication. Know how to multiply two matrices and be able to write down the formula

a s 1.3 Matrix Multiplication. Know how to multiply two matrices and be able to write down the formula Syllabus for Math 308, Paul Smith Book: Kolman-Hill Chapter 1. Linear Equations and Matrices 1.1 Systems of Linear Equations Definition of a linear equation and a solution to a linear equations. Meaning

More information

Review of Basic Concepts in Linear Algebra

Review of Basic Concepts in Linear Algebra Review of Basic Concepts in Linear Algebra Grady B Wright Department of Mathematics Boise State University September 7, 2017 Math 565 Linear Algebra Review September 7, 2017 1 / 40 Numerical Linear Algebra

More information

Markov Decision Processes

Markov Decision Processes Markov Decision Processes Lecture notes for the course Games on Graphs B. Srivathsan Chennai Mathematical Institute, India 1 Markov Chains We will define Markov chains in a manner that will be useful to

More information

Example Linear Algebra Competency Test

Example Linear Algebra Competency Test Example Linear Algebra Competency Test The 4 questions below are a combination of True or False, multiple choice, fill in the blank, and computations involving matrices and vectors. In the latter case,

More information