Linear algebra issues in Interior Point methods for bound-constrained least-squares problems

Stefania Bellavia, Dipartimento di Energetica S. Stecco, Università degli Studi di Firenze
Joint work with Jacek Gondzio and Benedetta Morini
Bologna, 6th-7th March 2008

Outline
1 Introduction: The problem; The Inexact Interior Point Framework; Focus on the Linear Algebra Phase
2 The Regularized I.P. Newton-like method
3 Iterative Linear Algebra: The preconditioner; Spectral properties; PPCG
4 Numerical Results

Bound-Constrained Least-Squares Problems

$$\min_{l \le x \le u} q(x) = \frac{1}{2}\|Ax - b\|_2^2 + \mu\|x\|^2$$

Vectors $l \in (\mathbb{R} \cup \{-\infty\})^n$ and $u \in (\mathbb{R} \cup \{+\infty\})^n$ are lower and upper bounds on the variables. $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $\mu \ge 0$ are given and $m \ge n$. We expect $A$ to be large and sparse.

We allow the solution $x^*$ to be degenerate:

$$x^*_i = l_i \ \text{or} \ x^*_i = u_i, \quad \nabla q(x^*)_i = 0, \quad \text{for some } i,\ 1 \le i \le n.$$

We limit the presentation to NNLS problems:

$$\min_{x \ge 0} q(x) = \frac{1}{2}\|Ax - b\|_2^2$$

We assume $A$ has full column rank, so there is a unique solution $x^*$. Let $g(x) = \nabla q(x) = A^T(Ax - b)$ and let $D(x)$ be the diagonal matrix with entries

$$d_i(x) = \begin{cases} x_i & \text{if } g_i(x) \ge 0 \\ 1 & \text{otherwise.} \end{cases}$$

The core of our procedure is an Inexact Newton-like method applied to the First Order Optimality condition for NNLS:

$$D(x)g(x) = 0.$$

Inexact Newton Interior Point methods for $D(x)g(x) = 0$ [Bellavia, Macconi, Morini, NLAA, 2006]

The method uses ideas of [Heinkenschloss, Ulbrich, Ulbrich, Math. Progr., 1999]. Let $E(x)$ be the diagonal positive semidefinite matrix with entries

$$e_i(x) = \begin{cases} g_i(x) & \text{if } 0 \le g_i(x) < x_i^2 \ \text{or}\ g_i(x)^2 > x_i \\ 0 & \text{otherwise.} \end{cases}$$

Let $W(x)$ and $S(x)$ be the diagonal matrices

$$W(x) = (E(x) + D(x))^{-1}, \qquad S(x) = (W(x)D(x))^{1/2}.$$

Note that $(S(x))^2_{i,i} \in (0,1]$ and $(W(x)E(x))_{i,i} \in [0,1)$.
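
As a concrete illustration (a minimal sketch, not the authors' code), the following Matlab fragment builds $D$, $E$, $W$, $S$ on a small random NNLS instance; restricting the first branch of $e_i$ to nonnegative $g_i$ is our reading, chosen so that $E$ is positive semidefinite as stated:

    % Hypothetical small instance; all data is made up for illustration.
    rng(0); m = 30; n = 10;
    A = randn(m, n); b = randn(m, 1);
    x = rand(n, 1);                              % strictly positive iterate
    g = A' * (A*x - b);                          % g(x) = A'(Ax - b)
    d = ones(n, 1); d(g >= 0) = x(g >= 0);       % d_i = x_i if g_i >= 0, else 1
    e = zeros(n, 1);                             % e_i = g_i on the indicator set
    idx = (g >= 0) & ((g < x.^2) | (g.^2 > x));  % assumed: nonnegative g_i only
    e(idx) = g(idx);
    w = 1 ./ (e + d);                            % W = (E + D)^(-1), diagonal
    s = sqrt(w .* d);                            % S = (W D)^(1/2), diagonal
    disp(norm(s.^2 + w.*e - 1, inf))             % checks the identity S^2 + W E = I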

k-th iteration

Solve the s.p.d. system

$$Z_k \tilde p_k = -S_k g_k + r_k, \qquad \|r_k\| \le \eta_k \|W_k D_k g_k\|,$$

where $\eta_k \in [0,1)$ and $Z_k \equiv Z(x_k)$ is given by

$$Z_k = S_k^T(A^TA + D_k^{-1}E_k)S_k = S_k^T A^T A S_k + W_k E_k.$$

Form the step $p_k = S_k \tilde p_k$. Project it into the interior of the positive orthant:

$$\hat p_k = \max\{\sigma, 1 - \|P(x_k + p_k) - x_k\|\}\,(P(x_k + p_k) - x_k),$$

where $\sigma \in (0,1)$ is close to one.
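
A sketch of one such step in the same made-up setting (exact inner solve, so $r_k = 0$; the value of $\sigma$ is an assumption):

    % Same made-up setup as in the previous sketch.
    rng(0); m = 30; n = 10; A = randn(m,n); b = randn(m,1); x = rand(n,1);
    g = A'*(A*x - b);
    d = ones(n,1); d(g>=0) = x(g>=0);
    e = zeros(n,1); idx = (g>=0) & ((g<x.^2) | (g.^2>x)); e(idx) = g(idx);
    w = 1./(e+d); s = sqrt(w.*d); S = diag(s);
    Z = S*(A'*A)*S + diag(w.*e);        % Z_k = S'A'A S + W E
    pt = Z \ (-S*g);                    % exact solve: r_k = 0
    p = S*pt;                           % step in the original variables
    Proj = @(y) max(y, 0);              % projection onto the nonnegative orthant
    sigma = 0.99995;                    % assumed damping, close to one
    phat = max(sigma, 1 - norm(Proj(x+p) - x)) * (Proj(x+p) - x);
    x_new = x + phat;                   % stays strictly positive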

Globalization Phase

Set

$$x_{k+1} = x_k + (1-t)\hat p_k + t\, p^C_k, \qquad t \in [0,1),$$

where $p^C_k$ is a constrained Cauchy step. $t$ is chosen to guarantee a sufficient decrease of the objective function $q(x)$.

Strictly positive iterates. Eventually $t = 0$ is taken; up to quadratic convergence can be obtained without assuming strict complementarity at $x^*$.

The Linear Algebra Phase: normal equations

The system $Z_k \tilde p_k = -S_k g_k$ represents the normal equations for the least-squares problem

$$\min_{\tilde p \in \mathbb{R}^n} \|B_\delta \tilde p + h\|$$

with

$$B_\delta = \begin{pmatrix} AS_k \\ W_k^{1/2} E_k^{1/2} \end{pmatrix}, \qquad h = \begin{pmatrix} Ax_k - b \\ 0 \end{pmatrix}.$$
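
The equivalence is easy to check numerically; a sketch under the same made-up data (backslash on the tall matrix computes the least-squares solution):

    % Same made-up setup as in the earlier sketches.
    rng(0); m = 30; n = 10; A = randn(m,n); b = randn(m,1); x = rand(n,1);
    g = A'*(A*x - b);
    d = ones(n,1); d(g>=0) = x(g>=0);
    e = zeros(n,1); idx = (g>=0) & ((g<x.^2) | (g.^2>x)); e(idx) = g(idx);
    w = 1./(e+d); s = sqrt(w.*d);
    Bdelta = [A*diag(s); diag(sqrt(w.*e))];    % [A S_k; (W_k E_k)^(1/2)]
    h = [A*x - b; zeros(n,1)];
    pt_ls = Bdelta \ (-h);                     % min || B_delta p + h ||
    Z = diag(s)*(A'*A)*diag(s) + diag(w.*e);
    pt_ne = Z \ (-diag(s)*g);                  % normal-equations solution
    disp(norm(pt_ls - pt_ne))                  % agree to rounding error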

The Linear Algebra Phase: augmented system

The step $\tilde p_k$ can be obtained by solving

$$\underbrace{\begin{pmatrix} I & AS_k \\ S_k A^T & -W_k E_k \end{pmatrix}}_{H_\delta} \begin{pmatrix} q_k \\ \tilde p_k \end{pmatrix} = \begin{pmatrix} -(Ax_k - b) \\ 0 \end{pmatrix}.$$

Note that $W_k E_k$ is positive semidefinite and

$$v^T W_k E_k v \ge \delta\, v^T v, \quad \forall v \in \mathbb{R}^n, \qquad \text{where } 1 > \delta = \min_i (w_k e_k)_i.$$
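
The augmented-system route gives the same step; a sketch on the same made-up data:

    % Compare the augmented system and the normal equations.
    rng(0); m = 30; n = 10; A = randn(m,n); b = randn(m,1); x = rand(n,1);
    g = A'*(A*x - b);
    d = ones(n,1); d(g>=0) = x(g>=0);
    e = zeros(n,1); idx = (g>=0) & ((g<x.^2) | (g.^2>x)); e(idx) = g(idx);
    w = 1./(e+d); s = sqrt(w.*d); S = diag(s);
    Hdelta = [eye(m), A*S; S*A', -diag(w.*e)];
    sol = Hdelta \ [-(A*x - b); zeros(n,1)];
    pt_aug = sol(m+1:end);                      % the p~_k block of the solution
    pt_ne = (S*(A'*A)*S + diag(w.*e)) \ (-S*g);
    disp(norm(pt_aug - pt_ne))                  % agree to rounding error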

Conditioning issues

Let $0 < \sigma_1 \le \sigma_2 \le \dots \le \sigma_n$ be the singular values of $AS_k$, and assume $\sigma_1 \ll 1$.

If $\delta = 0$ then

$$\kappa_2(H_0) \approx \frac{1 + \sigma_n}{\sigma_1^2}, \qquad \kappa_2(B_0) \approx \frac{1 + \sigma_n}{\sigma_1},$$

i.e. $\kappa_2(H_0)$ may be much greater than $\kappa_2(B_0)$.

If $\delta > 0$ (regularized system), then

$$\kappa_2(H_\delta) \approx \frac{1 + \sigma_n}{\delta}, \qquad \kappa_2(B_\delta) \approx \frac{1 + \sigma_n}{\sqrt{\delta}},$$

i.e. if $\delta > \sigma_1$, then $\kappa_2(H_\delta)$ (resp. $\kappa_2(B_\delta)$) may be considerably smaller than $\kappa_2(H_0)$ (resp. $\kappa_2(B_0)$).
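
A contrived numerical illustration of these estimates (hypothetical singular values and $\delta$, not taken from the talk):

    % AS_k is built with one tiny singular value, sigma_1 = 1e-4.
    rng(0); m = 30; n = 10;
    [U,~] = qr(randn(m, n), 0); [V,~] = qr(randn(n));
    AS = U * diag(logspace(-4, 0, n)) * V';
    for delta = [0, 1e-2]
        B = [AS; sqrt(delta)*eye(n)];
        H = [eye(m), AS; AS', -delta*eye(n)];
        fprintf('delta = %5.0e:  cond(B) = %8.1e,  cond(H) = %8.1e\n', ...
                delta, cond(B), cond(H));
    end
    % Expected pattern: for delta = 0, cond(H) is roughly cond(B)^2;
    % for delta = 1e-2 > sigma_1, both condition numbers drop sharply.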

The Regularized I.P. Newton-like method

If $\sigma_1$ is not small, the regularization does not deteriorate $\kappa_2(H_\delta)$ with respect to $\kappa_2(H_0)$. Clear benefit from regularization (see also [Saunders, BIT, 1995], [Silvester and Wathen, SINUM, 1994]).

Modification of the Affine Scaling I.P. method:

$$\bar Z_k \tilde p_k = -S_k g_k + r_k$$

where

$$\bar Z_k = S_k^T(A^TA + D_k^{-1}E_k + \Delta_k)S_k = \underbrace{S_k^T A^T A S_k + W_k E_k}_{Z_k} + \Delta_k S_k^2$$

and $\Delta_k$ is diagonal with entries in $[0,1)$.

Fast convergence of the method is preserved (in the presence of degeneracy, too). The globalization strategy of [BMM] can be applied with slight modifications. The least-squares problem and the augmented system are regularized:

$$B_\delta = \begin{pmatrix} AS_k \\ C_k^{1/2} \end{pmatrix}, \qquad H_\delta = \begin{pmatrix} I & AS_k \\ S_k A^T & -C_k \end{pmatrix}, \qquad \text{where } C_k = W_k E_k + \Delta_k S_k^2.$$

Features of the method

Let $\tau \in (0,1)$ be a small positive threshold and

$$I_k = \{ i \in \{1,2,\dots,n\}\ \text{s.t.}\ (s_k^2)_i \ge 1 - \tau \}, \qquad A_k = \{1,2,\dots,n\} \setminus I_k, \qquad n_1 = \mathrm{card}(I_k),$$

so that $S_k = \mathrm{diag}((S_k)_I, (S_k)_A)$. Note that $S_k^2 + W_k E_k = I$.

When $x_k$ converges to $x^*$:

$$\lim_k (S_k)_I = I, \quad \lim_k (S_k)_A = 0, \quad \lim_k (W_k E_k)_I = 0, \quad \lim_k (W_k E_k)_A = I.$$

$I_k$ is expected to eventually settle down (inactive components and possibly degenerate components).

Solving the augmented system

The following partition of the augmented system is induced:

$$\begin{pmatrix} I & A_I(S_k)_I & A_A(S_k)_A \\ (S_k)_I A_I^T & -(C_k)_I & 0 \\ (S_k)_A A_A^T & 0 & -(C_k)_A \end{pmatrix} \begin{pmatrix} q_k \\ (\tilde p_k)_I \\ (\tilde p_k)_A \end{pmatrix} = \begin{pmatrix} -(Ax_k - b) \\ 0 \\ 0 \end{pmatrix}.$$

Eliminating $(\tilde p_k)_A$ we get

$$\underbrace{\begin{pmatrix} I + Q_k & A_I(S_k)_I \\ (S_k)_I A_I^T & -(C_k)_I \end{pmatrix}}_{\bar H_k} \begin{pmatrix} q_k \\ (\tilde p_k)_I \end{pmatrix} = \begin{pmatrix} -(Ax_k - b) \\ 0 \end{pmatrix}, \qquad \bar H_k \in \mathbb{R}^{(m+n_1)\times(m+n_1)}.$$
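
A sketch of this elimination on the usual made-up data (the uniform $\Delta_k = 10^{-2} I$ is an arbitrary choice for illustration):

    % Build the partition, solve the reduced system, recover (p~_k)_A,
    % and check the residual of the full augmented system.
    rng(0); m = 30; n = 10; tau = 0.1;
    A = randn(m,n); b = randn(m,1); x = rand(n,1);
    g = A'*(A*x - b);
    d = ones(n,1); d(g>=0) = x(g>=0);
    e = zeros(n,1); idx = (g>=0) & ((g<x.^2) | (g.^2>x)); e(idx) = g(idx);
    w = 1./(e+d); s = sqrt(w.*d);
    Del = 1e-2*ones(n,1);                       % illustrative regularization
    c = w.*e + Del.*s.^2;                       % C_k = W E + Delta S^2
    Ik = find(s.^2 >= 1 - tau); Ak = setdiff((1:n)', Ik);
    AI = A(:,Ik); AA = A(:,Ak);
    Q = AA * diag(s(Ak).^2 ./ c(Ak)) * AA';     % Q_k = A_A (S C^-1 S)_A A_A'
    Hbar = [eye(m)+Q, AI*diag(s(Ik)); diag(s(Ik))*AI', -diag(c(Ik))];
    sol = Hbar \ [-(A*x - b); zeros(numel(Ik),1)];
    q = sol(1:m); pI = sol(m+1:end);
    pA = (s(Ak) .* (AA'*q)) ./ c(Ak);           % eliminated block, recovered
    p = zeros(n,1); p(Ik) = pI; p(Ak) = pA;
    Haug = [eye(m), A*diag(s); diag(s)*A', -diag(c)];
    disp(norm(Haug*[q; p] - [-(A*x - b); zeros(n,1)]))   % ~ rounding error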

The Preconditioner

Note that

$$\bar H_k = \underbrace{\begin{pmatrix} I & A_I(S_k)_I \\ (S_k)_I A_I^T & -(\Delta_k S_k^2)_I \end{pmatrix}}_{P_k} + \begin{pmatrix} Q_k & 0 \\ 0 & -(W_k E_k)_I \end{pmatrix}, \qquad \text{where } Q_k = A_A (S_k C_k^{-1} S_k)_A A_A^T.$$

When $x_k$ converges to $x^*$, $(S_k)_A \to 0$ and $(C_k)_A \to I$, so

$$\lim_k Q_k = 0, \qquad \lim_k (W_k E_k)_I = 0.$$

Factorization of the Preconditioner

$$P_k = \begin{pmatrix} I & A_I(S_k)_I \\ (S_k)_I A_I^T & -(\Delta_k S_k^2)_I \end{pmatrix}$$

can be factorized as

$$P_k = \begin{pmatrix} I & 0 \\ 0 & (S_k)_I \end{pmatrix} \underbrace{\begin{pmatrix} I & A_I \\ A_I^T & -(\Delta_k)_I \end{pmatrix}}_{\Pi_k} \begin{pmatrix} I & 0 \\ 0 & (S_k)_I \end{pmatrix}.$$

If $I_k$ and $\Delta_k$ remain unchanged for a few iterations, the factorization of the matrix $\Pi_k$ does not have to be updated. $I_k$ is expected to eventually settle down.
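
A sketch of how $P_k^{-1}$ can then be applied: factorize $\Pi_k$ once (here with Matlab's ldl) and wrap it in the cheap diagonal scalings:

    % Same made-up data; Delta on I_k is an arbitrary illustrative value.
    rng(0); m = 30; n = 10; tau = 0.1;
    A = randn(m,n); b = randn(m,1); x = rand(n,1);
    g = A'*(A*x - b);
    d = ones(n,1); d(g>=0) = x(g>=0);
    e = zeros(n,1); idx = (g>=0) & ((g<x.^2) | (g.^2>x)); e(idx) = g(idx);
    w = 1./(e+d); s = sqrt(w.*d);
    Ik = find(s.^2 >= 1 - tau); sI = s(Ik); DelI = 1e-2*ones(numel(Ik),1);
    AI = A(:,Ik);
    PiK = [eye(m), AI; AI', -diag(DelI)];
    [L, D, pp] = ldl(PiK, 'vector');         % factorize Pi_k once, then reuse
    r = randn(m + numel(Ik), 1);             % vector to be preconditioned
    scl = [ones(m,1); sI];                   % diag(I, (S_k)_I)
    z = r ./ scl;                            % scale in
    y = zeros(size(z)); y(pp) = L' \ (D \ (L \ z(pp)));   % solve Pi_k y = z
    u = y ./ scl;                            % scale out: u = P_k^{-1} r
    Pk = [eye(m), AI*diag(sI); diag(sI)*AI', -diag(DelI.*sI.^2)];
    disp(norm(Pk*u - r))                     % ~ rounding error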

Eigenvalues

$P_k^{-1}\bar H_k$ has at least $m - n + n_1$ eigenvalues at 1. The other eigenvalues are positive and of the form

$$\lambda = 1 + \mu, \qquad \mu = \frac{u^T Q_k u + v^T (W_k E_k)_I v}{u^T u + v^T (\Delta_k S_k^2)_I v},$$

where $(u^T, v^T)^T$ is an eigenvector associated with $\lambda$. If $\mu$ is small, the eigenvalues of $P_k^{-1}\bar H_k$ are clustered around one. This is the case when $x_k$ is close to the solution.
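
The unit-eigenvalue count is easy to verify numerically; a sketch reusing the reduced system from the earlier elimination sketch:

    % Count eigenvalues of P_k^{-1} Hbar_k close to 1.
    rng(0); m = 30; n = 10; tau = 0.1;
    A = randn(m,n); b = randn(m,1); x = rand(n,1);
    g = A'*(A*x - b);
    d = ones(n,1); d(g>=0) = x(g>=0);
    e = zeros(n,1); idx = (g>=0) & ((g<x.^2) | (g.^2>x)); e(idx) = g(idx);
    w = 1./(e+d); s = sqrt(w.*d);
    Del = 1e-2*ones(n,1); c = w.*e + Del.*s.^2;
    Ik = find(s.^2 >= 1 - tau); Ak = setdiff((1:n)', Ik); n1 = numel(Ik);
    AI = A(:,Ik); AA = A(:,Ak);
    Q = AA * diag(s(Ak).^2 ./ c(Ak)) * AA';
    Hbar = [eye(m)+Q, AI*diag(s(Ik)); diag(s(Ik))*AI', -diag(c(Ik))];
    Pk = [eye(m), AI*diag(s(Ik)); diag(s(Ik))*AI', -diag(Del(Ik).*s(Ik).^2)];
    lam = eig(Pk \ Hbar);
    fprintf('%d of %d eigenvalues within 1e-8 of 1 (bound: m-n+n1 = %d)\n', ...
            nnz(abs(lam - 1) < 1e-8), numel(lam), m - n + n1);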

Eigenvalues ($x_k$ far away from $x^*$)

The eigenvalues of $P_k^{-1}\bar H_k$ have the form $\lambda = 1 + \mu$ and:

If $(\Delta_k)_{i,i} = \delta > 0$ for $i \in I_k$:

$$\mu \le \frac{\|A_A(S_k)_A\|^2}{\tau} + \frac{\tau}{\delta(1-\tau)}.$$

If

$$(\Delta_k)_{i,i} = \begin{cases} (w_k)_i(e_k)_i & \text{for } i \in I_k \ \text{and}\ (w_k)_i(e_k)_i \ne 0 \\ \delta > 0 & \text{for } i \in I_k \ \text{and}\ (w_k)_i(e_k)_i = 0, \end{cases}$$

then

$$\mu \le \frac{\|A_A(S_k)_A\|^2}{\tau} + \frac{1}{1-\tau}.$$

Better distribution of the eigenvalues. A scaling of $A$ at the beginning of the process is advisable.

Solving the augmented system by PPCG

We can adopt the Projected Preconditioned Conjugate Gradient (PPCG) method [Gould, 1999], [Dollar, Gould, Schilders, Wathen, SIMAX, 2006]. It is a CG procedure for solving indefinite systems

$$\begin{pmatrix} H & A \\ A^T & -C \end{pmatrix} \begin{pmatrix} p \\ q \end{pmatrix} = \begin{pmatrix} g \\ 0 \end{pmatrix}$$

with $H \in \mathbb{R}^{m\times m}$ symmetric, $C \in \mathbb{R}^{n\times n}$ ($n \le m$) symmetric, $A \in \mathbb{R}^{m\times n}$ of full rank, using preconditioners of the form

$$\begin{pmatrix} G & A \\ A^T & -T \end{pmatrix}$$

with $G$ symmetric and $T$ nonsingular.

The preconditioner Spectral properties PPCG When C is nonsingular, PPCG is equivalent to applying PCG to the system (H + AC 1 A T )p = g with preconditioner: G + AT 1 A T In our case, it is equivalent to applying PCG to the system: (I + Q k + A I (S k C 1 k S k ) I A T I ) q k = (Ax k b), }{{} F k using ther preconditioner G k = I + A I ( k ) 1 I AT I,

Eigenvalues of $G_k^{-1}F_k$

If $(\Delta_k)_{i,i} = (w_k)_i(e_k)_i$ for $i \in I_k$, then the eigenvalues of $G_k^{-1}F_k$ satisfy

$$1 - \tfrac{1}{2}\tau \le \lambda \le 1 + \frac{\|A_A(S_k)_A\|^2}{\tau}.$$

Drawback: unlike the previous results, no cluster of eigenvalues at 1 is guaranteed. Advantage: PPCG is characterized by a minimization property and requires a fixed amount of work per iteration.

Implementation issues

Dynamic regularization:

$$(\Delta_k)_{i,i} = \begin{cases} 0, & \text{if } i \notin I_k \ \text{(i.e. } (w_k)_i(e_k)_i > \tau) \\ \min\{\max\{10^{-3}, (w_k)_i(e_k)_i\}, 10^{-2}\}, & \text{otherwise.} \end{cases}$$

Iterative solver: PPCG with an adaptive choice of the tolerance in the stopping criterion. Linear systems are solved with an accuracy that increases as the solution is approached. PPCG is stopped when the preconditioned residual drops below

$$\mathrm{tol} = \max\left(10^{-7},\ \eta_k\,\frac{\|W_k D_k g_k\|}{\|A^T S_k\|_1}\right).$$
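
A sketch of this rule on the usual made-up data (reading the zero branch as $i \notin I_k$, and the stopping-test norm as $\|W_kD_kg_k\|/\|A^TS_k\|_1$, are our assumptions):

    % w, e, d, g, s as in the earlier sketches; eta_k fixed for illustration.
    rng(0); m = 30; n = 10; A = randn(m,n); b = randn(m,1); x = rand(n,1);
    g = A'*(A*x - b);
    d = ones(n,1); d(g>=0) = x(g>=0);
    e = zeros(n,1); idx = (g>=0) & ((g<x.^2) | (g.^2>x)); e(idx) = g(idx);
    w = 1./(e+d); s = sqrt(w.*d);
    tau = 0.1; eta = 0.1;
    we = w .* e;
    Del = min(max(1e-3, we), 1e-2);   % clipped to [1e-3, 1e-2] on I_k
    Del(we > tau) = 0;                % i outside I_k: no regularization
    tol = max(1e-7, eta * norm(w.*d.*g) / norm(A'*diag(s), 1));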

To avoid preconditioner factorizations: at iteration $k+1$, freeze the set $I_k$ and the matrix $\Delta_k$ if

$$\#(\text{PPCG iterations})_k \le 30 \quad \text{and} \quad |\mathrm{card}(I_{k+1}) - \mathrm{card}(I_k)| \le 10.$$

If $I_k$ is empty (i.e. $\|S_k\|^2 \le 1 - \tau$): we apply PCG to the normal system

$$(S_k^T A^T A S_k + C_k)\,\tilde p_k = -S_k A^T(Ax_k - b).$$

Matlab code, $\epsilon_m = 2 \cdot 10^{-16}$. The threshold $\tau$ is set to 0.1. Initial guess $x_0 = (1,\dots,1)^T$.

Successful termination when

$$q_{k-1} - q_k < \epsilon\,(1 + |q_{k-1}|), \qquad \|x_k - x_{k-1}\|_2 \le \epsilon\,(1 + \|x_k\|_2), \qquad \|P(x_k - g_k) - x_k\|_2 < \epsilon^{1/3}\,(1 + \|g_k\|_2),$$

with $\epsilon = 10^{-9}$, or when

$$\|P(x_k - g_k) - x_k\|_2 \le \epsilon.$$

A failure is declared after 100 iterations.
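
As a predicate, the tests read as follows (a sketch with placeholder iterate history, only to make it runnable):

    % Placeholder history; eps_tol stands for epsilon above.
    eps_tol = 1e-9;
    qk = 0.5; qkm1 = qk + 1e-12;
    xk = ones(5,1); xkm1 = xk + 1e-7*ones(5,1); gk = 1e-8*ones(5,1);
    Proj = @(y) max(y, 0);
    crit = Proj(xk - gk) - xk;                  % projected-gradient residual
    done = (qkm1 - qk < eps_tol*(1 + abs(qkm1))) && ...
           (norm(xk - xkm1) <= eps_tol*(1 + norm(xk))) && ...
           (norm(crit) < eps_tol^(1/3)*(1 + norm(gk)));
    done = done || (norm(crit) <= eps_tol);     % alternative single test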

Test Problems

The matrix $A$ is the transpose of the matrices in the LPnetlib subset of The University of Florida Sparse Matrix Collection. We discarded the matrices with $m < 1000$ and the matrices that are not of full rank: a total of 56 matrices, with dimensions up to $10^5$.

The vector $b$ is set to $b = A(1,1,\dots,1)^T$. When $\|A\|_1 > 10^3$, we scaled the matrix using a simple row and column scaling scheme.
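
A sketch of this setup; ssget (the Matlab interface to the collection, assumed installed) would supply the real matrices, and the equilibration below is just one plausible reading of "simple scheme":

    % Prob = ssget('LPnetlib/lp_fit2p'); A = Prob.A';   % a real test matrix
    rng(0); A = sprandn(2000, 300, 0.01) + speye(2000, 300);  % sparse stand-in
    [m, n] = size(A);
    if norm(A, 1) > 1e3
        r = full(max(abs(A), [], 2)); r(r == 0) = 1;
        A = spdiags(1./r, 0, m, m) * A;                 % row scaling
        c = full(max(abs(A), [], 1))'; c(c == 0) = 1;
        A = A * spdiags(1./c, 0, n, n);                 % column scaling
    end
    b = A * ones(n, 1);                                 % b = A(1,...,1)'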

Numerical Results

On a total of 56 test problems we successfully solve 51:

41 test problems are solved with fewer than 20 nonlinear iterations. In 40 tests the average number of PPCG iterations does not exceed 40. In 8 tests the solution is the null vector; at each iteration $I_k = \emptyset$, $S_k^T A^T A S_k + C_k \approx I$, and the convergence of the linear solver is very fast.

Savings in the number of preconditioner factorizations

Percent reduction in the dimension $n$: we solve augmented systems of reduced dimension $m + n_1$.

Future work

More experimentation, also using QMR and GMRES.

Develop a code for the more general problem:

$$\min_{l \le x \le u} q(x) = \frac{1}{2}\|Ax - b\|_2^2 + \mu\|x\|^2.$$

If $\mu > 0$: $A$ may also be rank deficient, and the augmented systems are naturally regularized.

Comparison with existing codes (e.g. BCLS (Friedlander), PDCO (Saunders)).