Infeasible Primal-Dual (Path-Following) Interior-Point Methods for Semidefinite Programming*

Notes for CAAM 564, Spring 2012
Dept. of CAAM, Rice University

Outline
(1) Introduction
(2) Formulation & a complexity theorem
(3) Computation and numerical results
(4) Comments

* Yin Zhang: On extending primal-dual interior-point algorithms from LP to SDP, SIOPT, 1998 (Tech. Report, 1995).

Primal SDP:

$$\min\ C \bullet X \quad \text{s.t.}\ A_i \bullet X = b_i,\ i = 1{:}m,\quad X \succeq 0,$$

where $C \bullet X = \operatorname{tr}(C^T X)$, the $A_i$ and $C$ are symmetric, and $X \succeq 0$ means $X$ is symmetric positive semidefinite.

Diagonal $C, A_i$ $\Rightarrow$ LP.

SDP constraints are nonlinear: for $X \in \mathcal{S}^2$,

$$X \succeq 0 \iff \{\, x_{11} \ge 0,\ x_{22} \ge 0,\ x_{11}x_{22} \ge x_{12}^2 \,\}.$$

Dual SDP:

$$\max\ b^T y \quad \text{s.t.}\ \sum_{i=1}^m y_i A_i + Z = C,\quad Z \succeq 0.$$
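A quick numerical illustration of the nonlinear $2\times 2$ characterization above (a sketch of mine, not part of the notes; the helper name `is_psd_2x2` is made up):

```python
import numpy as np

def is_psd_2x2(x11, x22, x12):
    """X = [[x11, x12], [x12, x22]] is PSD iff these inequalities hold."""
    return x11 >= 0 and x22 >= 0 and x11 * x22 >= x12**2

# Agrees with the eigenvalue test on a few sample points:
for x11, x22, x12 in [(1.0, 1.0, 0.5), (1.0, 1.0, 1.5), (0.0, 1.0, 0.0)]:
    X = np.array([[x11, x12], [x12, x22]])
    by_eigs = np.linalg.eigvalsh(X).min() >= 0
    assert by_eigs == is_psd_2x2(x11, x22, x12)
```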

Strong duality: $C \bullet X - b^T y = X \bullet Z = 0$ at optimality if both the primal and the dual are strictly feasible (Slater constraint qualification).

Note: For $X, Z \succeq 0$, $\;X \bullet Z = 0 \iff XZ = 0$.

Proof: ($\Leftarrow$) is trivial. ($\Rightarrow$):

$$X \bullet Z = \operatorname{tr}(XZ) = 0 \implies \operatorname{tr}(XZ^{1/2}Z^{1/2}) = 0 \implies \operatorname{tr}(Z^{1/2}XZ^{1/2}) = 0$$
$$\implies \textstyle\sum_i \lambda_i(Z^{1/2}XZ^{1/2}) = 0 \implies \lambda_i(Z^{1/2}XZ^{1/2}) = 0\ \forall i \implies Z^{1/2}XZ^{1/2} = 0 \ \text{(symmetry)}$$
$$\implies (Z^{1/2}X^{1/2})(Z^{1/2}X^{1/2})^T = 0 \implies Z^{1/2}X^{1/2} = 0 \implies ZX = 0.$$

(The eigenvalue step uses $Z^{1/2}XZ^{1/2} \succeq 0$: its eigenvalues are nonnegative, so a zero trace forces all of them to vanish.)
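A two-line numeric companion (my illustration, not from the notes): complementary PSD matrices with zero inner product do have a vanishing matrix product.

```python
import numpy as np

X = np.diag([1.0, 0.0])   # PSD, rank 1
Z = np.diag([0.0, 2.0])   # PSD, rank 1, complementary to X
print(np.trace(X @ Z))    # 0.0: the inner product X . Z vanishes
print(X @ Z)              # and so does the matrix product, as proved above
```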

SDP applications:
- Eigenvalue optimization
- Control theory: linear matrix inequalities in system and control theory (book by Boyd et al., SIAM, 1994)
- Combinatorial optimization, e.g., the Max-Cut relaxation
- Non-convex optimization, e.g., QCQP relaxations

Definitions:
- Interior point: $(X, y, Z)$ with $X, Z \succ 0$.
- Infeasible algorithm: allows infeasible iterates.
- Primal-dual algorithm: a perturbed and damped Newton (or composite Newton) method applied to a form of the optimality-condition system.

Standard optimality conditions: find $(X, y, Z)$, $X, Z \succeq 0$, such that

$$A_i \bullet X - b_i = 0 \quad (m \text{ equations}),$$
$$\textstyle\sum_i y_i A_i + Z - C = 0 \quad (n(n+1)/2 \text{ equations}),$$
$$XZ = 0 \quad (n^2 \text{ equations}).$$

This optimality system is not square: the complementarity equation needs symmetrization.

Weighted symmetrization (YZ '95):

$$H_P(M) = (PMP^{-1} + P^{-T}M^TP^T)/2.$$

For $X, Z \succ 0$ and any nonsingular $P$: $XZ = \tau I \iff H_P(XZ) = \tau I$. Hence complementarity becomes $H_P(XZ) = 0$, and the central path becomes $H_P(XZ) = \mu I$.
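A minimal numerical sketch of $H_P$ (my own illustration; the random test data is made up):

```python
import numpy as np

def H(P, M):
    """Weighted symmetrization H_P(M) = (P M P^{-1} + P^{-T} M^T P^T)/2."""
    S = P @ M @ np.linalg.inv(P)
    return (S + S.T) / 2

rng = np.random.default_rng(0)
n = 4
P = rng.standard_normal((n, n)) + n * np.eye(n)   # any nonsingular P

# If XZ = tau*I, then H_P(XZ) = tau*I for every nonsingular P:
tau = 2.0
print(np.allclose(H(P, tau * np.eye(n)), tau * np.eye(n)))   # True

# With P = I, H_I reduces to the plain symmetrization (M + M^T)/2:
M = rng.standard_normal((n, n))
print(np.allclose(H(np.eye(n), M), (M + M.T) / 2))           # True
```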

$H_P$ unifies a family of primal-dual formulations. Its three most promising members:

- AHO (Alizadeh/Haeberly/Overton '94): $P = I$, giving $XZ + ZX = 0$.
- KSH (Kojima et al. '94, Monteiro '95): $P_k = [Z^k]^{1/2}$, giving $Z^kXZ + ZXZ^k = 0$.
- NT (Nesterov/Todd '95): $W = X^{1/2}(X^{1/2}ZX^{1/2})^{-1/2}X^{1/2}$, $P_k = [W^k]^{-1/2}$.

A sequence of equivalent systems for varying $P_k$:

  Formulation    AHO   KSH   NT
  Real Newton?   Yes   No    No

- AHO: its complexity is difficult to analyze.
- KSH: complexity results were obtained only for feasible or short-step methods.
- In favor: infeasible long-step methods, which are implementable and most efficient in practice.
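As a sanity check on the NT scaling matrix defined above, here is a small sketch (mine; `psd_power` is an assumed helper) verifying the characterizing identity $WZW = X$:

```python
import numpy as np

def psd_power(A, p):
    """A^p for symmetric positive definite A, via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.T

rng = np.random.default_rng(1)
n = 5
B = rng.standard_normal((n, n)); X = B @ B.T + n * np.eye(n)
B = rng.standard_normal((n, n)); Z = B @ B.T + n * np.eye(n)

Xh = psd_power(X, 0.5)                       # X^{1/2}
W = Xh @ psd_power(Xh @ Z @ Xh, -0.5) @ Xh   # the NT scaling matrix

print(np.allclose(W @ Z @ W, X))             # True: W maps Z to X
```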

Central path: $\{(X, Z) \succ 0 : XZ = \mu I,\ \mu > 0\}$.
On the central path: $\lambda_i(XZ) = \mu$ for all $i$.

[Figure: the central path drawn in the $(\lambda_1(XZ), \lambda_2(XZ))$ plane.]

Short-step methods use a narrow neighborhood of the central path; long-step methods use a wide one.

A narrow neighborhood: for $\alpha \in (0,1)$,

$$\mathcal{N}_\alpha = \{(X,Z) \succ 0 : \|XZ - \mu I\|_F \le \alpha\mu\},$$

where $\mu = \operatorname{tr}(XZ)/n$ is the normalized duality gap.

A wide neighborhood: for $\gamma \in (0,1)$,

$$\mathcal{N}_\gamma = \{(X,Z) \succ 0 : \lambda_{\min}(XZ) \ge \gamma\mu\},$$

where again $\mu = \operatorname{tr}(XZ)/n$.

A neighborhood with feasibility priority:

$$\widehat{\mathcal{N}}_\gamma = \left\{(X,Z) \in \mathcal{N}_\gamma : \frac{\|r\|}{\|r^0\|} \le \frac{\mu}{\mu^0}\right\},$$

where $r$ measures infeasibility.
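A small membership test for the two neighborhoods (my sketch; it uses the similarity of $XZ$ to the symmetric matrix $Z^{1/2}XZ^{1/2}$ to obtain the real spectrum of $XZ$):

```python
import numpy as np

def mu_of(X, Z):
    return np.trace(X @ Z) / X.shape[0]

def in_narrow(X, Z, alpha):
    mu, n = mu_of(X, Z), X.shape[0]
    return np.linalg.norm(X @ Z - mu * np.eye(n), 'fro') <= alpha * mu

def in_wide(X, Z, gamma):
    w, V = np.linalg.eigh(Z)                         # Z assumed positive definite
    Zh = (V * np.sqrt(w)) @ V.T                      # Z^{1/2}
    lam_min = np.linalg.eigvalsh(Zh @ X @ Zh).min()  # = lambda_min(XZ)
    return lam_min >= gamma * mu_of(X, Z)

n = 4
X = Z = np.eye(n)        # perfectly centered point: XZ = I, mu = 1
print(in_narrow(X, Z, alpha=0.1), in_wide(X, Z, gamma=0.9))  # True True
```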

Column-stacking operator $\operatorname{vec} : \mathbb{R}^{n\times n} \to \mathbb{R}^{n^2}$:

$$\operatorname{vec}\begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix} = (1\ 2\ 3\ 4)^T.$$

SDP: find $v = (X, y, Z)$, $X, Z \succeq 0$, such that

$$F_k(v) = \begin{pmatrix} \mathcal{A}^T y + \operatorname{vec} Z - \operatorname{vec} C \\ \mathcal{A}\operatorname{vec} X - b \\ \operatorname{vec}(Z^kXZ + ZXZ^k) \end{pmatrix} = 0,$$

where $\mathcal{A}^T = [\operatorname{vec} A_1 \ \cdots \ \operatorname{vec} A_m]$.

Infeasible long-step algorithm (YZ '95):

  Choose $v^0 = (X^0, y^0, Z^0)$ with $X^0, Z^0 \succ 0$.
  For $k = 0, 1, 2, \ldots$, until $\mu_k \le 2^{-L}$, do
    (1) solve $F_k'(v^k)\,\Delta v = -F_k(v^k) + \sigma\mu_k p^k$;
    (2) choose $\alpha$ so that $v^{k+1} = v^k + \alpha\Delta v \in \widehat{\mathcal{N}}_\gamma$.
  End

Here $\sigma \in (0,1)$, and the centering term $p^k$ is derived from $Z^kXZ + ZXZ^k - 2\sigma\mu_k Z^k = 0$.
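The vec operator and the assembly of $\mathcal{A}$ in NumPy (my illustration; note NumPy flattens row-major by default, so column stacking needs `order='F'`):

```python
import numpy as np

def vec(M):
    """Column-stacking vec operator."""
    return M.flatten(order='F')

print(vec(np.array([[1, 3],
                    [2, 4]])))            # [1 2 3 4], as on the slide

# Rows of A are vec(A_i)^T, so A @ vec(X) collects the values A_i . X:
rng = np.random.default_rng(2)
n, m = 3, 2
sym = lambda B: (B + B.T) / 2
As = [sym(rng.standard_normal((n, n))) for _ in range(m)]
A = np.vstack([vec(Ai) for Ai in As])     # m x n^2
X = sym(rng.standard_normal((n, n)))
print(np.allclose(A @ vec(X), [np.trace(Ai @ X) for Ai in As]))  # True
```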

The key to the convergence analysis is estimating the second-order term.

Key Lemma (YZ '95): Under the conditions that
(1) a solution $(X^*, y^*, Z^*)$ exists,
(2) $X^0 = Z^0 = \rho I$ with $\rho \ge \max\{\lambda_1(X^*),\ \lambda_1(Z^*),\ \operatorname{tr}(X^*+Z^*)/n\}$ (*), and
(3) $(X, Z) \in \widehat{\mathcal{N}}_\gamma$,
the second-order term satisfies

$$\left\|Z^{1/2}\,\Delta X\,\Delta Z\,Z^{-1/2}\right\|_F \le \frac{19n}{\gamma}\left(\frac{\lambda_1(XZ)}{\lambda_n(XZ)}\right)^{1/2}\frac{\operatorname{tr}(XZ)}{2}.$$

Polynomial complexity (YZ '95):
Theorem: Under conditions (1)-(3), the algorithm terminates ($\mu_k \le 2^{-L}$) in at most $O(n^{2.5}L)$ iterations. Global convergence holds without (*) in (2).

Computation: how difficult is it to solve an SDP?

Distinctions from LP:
(1) Special code for special problems, e.g., LP.
(2) The cost of the linear system varies with the method.
(3) Parallelism is more important than sparsity.

Linear system $F_k'(v^k)\,\Delta v = -F_k(v^k) + \sigma\mu_k p^k$:

$$\begin{pmatrix} 0 & \mathcal{A}^T & I \\ \mathcal{A} & 0 & 0 \\ E & 0 & F \end{pmatrix}\begin{pmatrix} \operatorname{vec}\Delta X \\ \Delta y \\ \operatorname{vec}\Delta Z \end{pmatrix} = \begin{pmatrix} \operatorname{vec} R_d \\ r_p \\ \operatorname{vec} R_c \end{pmatrix},$$

where $E$ and $F$ are no longer diagonal.

Solution procedure:
(1) Form $M = \mathcal{A}(E^{-1}F)\mathcal{A}^T$.
(2) Solve $M\,\Delta y = r_p + \mathcal{A}[E^{-1}(F\operatorname{vec}R_d - \operatorname{vec}R_c)]$.
(3) $\Delta Z = R_d - \sum_{i=1}^m \Delta y_i A_i$.
(4) $\Delta X = \sigma\mu Z^{-1} - X - H(X\,\Delta Z\,Z^{-1})$.

Most of the computation is in steps (1)-(2). $M$ is $m \times m$, with $m \in [0, n(n+1)/2]$.
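A dense-matrix sketch of steps (1)-(2) (my illustration), using the KSH choice $E = Z \otimes Z$, $F = (ZX \otimes I + I \otimes ZX)/2$ listed on the next slide, with random stand-ins for the residuals:

```python
import numpy as np

def vec(M): return M.flatten(order='F')

rng = np.random.default_rng(3)
n, m = 4, 3
sym = lambda B: (B + B.T) / 2
X = sym(rng.standard_normal((n, n))) + n * np.eye(n)   # X > 0
Z = sym(rng.standard_normal((n, n))) + n * np.eye(n)   # Z > 0
As = [sym(rng.standard_normal((n, n))) for _ in range(m)]
A = np.vstack([vec(Ai) for Ai in As])                  # m x n^2

I = np.eye(n)
E = np.kron(Z, Z)
F = (np.kron(Z @ X, I) + np.kron(I, Z @ X)) / 2

# Step (1): Schur complement M = A (E^{-1} F) A^T   (m x m)
M = A @ np.linalg.solve(E, F @ A.T)

# Step (2): M dy = r_p + A [E^{-1}(F vec(R_d) - vec(R_c))]
r_p = rng.standard_normal(m)
R_d = sym(rng.standard_normal((n, n)))
R_c = sym(rng.standard_normal((n, n)))
dy = np.linalg.solve(M, r_p + A @ np.linalg.solve(E, F @ vec(R_d) - vec(R_c)))
print(dy)
```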

Consider forming $M = \mathcal{A}(E^{-1}F)\mathcal{A}^T$.
Notation: Kronecker product $U \otimes V = [u_{ij}V]$.

KSH: $E = Z \otimes Z$, $F = (ZX \otimes I + I \otimes ZX)/2$
- $M \succeq 0$, with $M_{ij} = \operatorname{tr}(Z^{-1}A_iXA_j)$.

AHO: $E = Z \otimes I + I \otimes Z$, $F = X \otimes I + I \otimes X$
- $M$ asymmetric (LU factorization).
- $ZP_i + P_iZ = A_i$: a Lyapunov equation for each $P_i$.
- $M_{ij} = \operatorname{tr}(P_i(XA_j + A_jX))$.

NT: $W = X^{1/2}(X^{1/2}ZX^{1/2})^{-1/2}X^{1/2}$, $E = I \otimes I$, $F = W \otimes W$
- $M \succeq 0$, with $M_{ij} = \operatorname{tr}(A_iWA_jW)$.
- 2 Cholesky factorizations + an SVD for $W$ (TTT '96).

work(AHO) > work(NT) > work(KSH)
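A quick numeric confirmation (mine) that the NT entry formula matches the Kronecker form $M = \mathcal{A}(W \otimes W)\mathcal{A}^T$:

```python
import numpy as np

def vec(M): return M.flatten(order='F')

rng = np.random.default_rng(4)
n, m = 4, 3
sym = lambda B: (B + B.T) / 2
W = sym(rng.standard_normal((n, n))) + n * np.eye(n)   # stand-in for NT's W
As = [sym(rng.standard_normal((n, n))) for _ in range(m)]
A = np.vstack([vec(Ai) for Ai in As])

M_kron = A @ np.kron(W, W) @ A.T
M_trace = np.array([[np.trace(Ai @ W @ Aj @ W) for Aj in As] for Ai in As])
print(np.allclose(M_kron, M_trace))   # True
```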

Leading cost* per iteration for dense problems
(Form: $O(m(m+n)n^2)$; Solve: $O(m^3)$)

  m =     O(1)      O(n)      O(n^2)
  Form    O(n^3)*   O(n^4)*   O(n^6)
  Solve   O(1)      O(n^3)    O(n^6)*

Forming $M$ is the most expensive step.
Bad news: $O(n^4)$ per iteration is the most likely cost.
Good news: rich parallelism.
- Coarse grain: the entries $M_{ij}$ can be computed independently.
- Fine grain: rich in matrix multiplications.
Expect (almost) linear speed-up.

Preliminary implementation.

Predictor-corrector algorithm:

  Choose $v^0 = (X^0, y^0, Z^0)$ with $X^0, Z^0 \succ 0$.
  For $k = 0, 1, 2, \ldots$, until $\mu_k \le 2^{-L}$, do
    (1) Solve $F_k'(v^k)\,\Delta v_p = -F_k(v^k)$. Choose $\sigma_k$.
    (2) Solve $F_k'(v^k)\,\Delta v_c = -F_k(v^k + \Delta v_p) + \sigma_k\mu_k p^k$.
    (3) Let $\Delta v = \Delta v_p + \Delta v_c$.
    (4) Choose $\alpha$ so that $v^{k+1} = v^k + \alpha\Delta v$ stays interior.
  End

(An extension of Mehrotra's framework to SDP.)

LiteSDP, a short MATLAB code:
- for general problems, dense or sparse
- only the KSH formulation implemented
- backtracking on $\alpha$ using Cholesky factorizations
- initial point $X^0 = Z^0 = nI$
- tested on randomly generated problems
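A schematic driver for this loop (my sketch, not the LiteSDP source). All problem-specific work (residuals, the Newton solve via the Schur-complement procedure, the Cholesky-based interiority test) is passed in as callables, so only the control flow of steps (1)-(4) is shown:

```python
def predictor_corrector(v0, mu_of, residual, centered_residual,
                        newton_solve, backtrack, L=40, sigma=0.25):
    """Mehrotra-style predictor-corrector outer loop, steps (1)-(4)."""
    v = v0
    while mu_of(v) > 2.0 ** (-L):
        dv_p = newton_solve(v, -residual(v))        # (1) predictor step
        rhs = centered_residual(v, dv_p, sigma)     # (2) -F(v + dv_p) + sigma*mu*p
        dv_c = newton_solve(v, rhs)                 #     corrector step
        dv = dv_p + dv_c                            # (3) combined direction
        alpha = backtrack(v, dv)                    # (4) keep X, Z positive definite
        v = v + alpha * dv
    return v
```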

General problem: m = 400, n = 200

Iter. No    P_inf      D_inf      Gap        Cpu
--------------------------------------------
iter  0:   9.62e-01   1.15e+00   1.70e+03   0.77
iter  1:   1.52e-15   9.60e-01   1.33e+00   1.45
iter  2:   1.63e-15   2.12e-16   3.11e+00   1.81
iter  3:   1.31e-15   2.12e-16   2.18e+00   1.51
...
iter  9:   1.20e-15   2.09e-16   1.34e-03   1.52
iter 10:   1.27e-15   2.10e-16   3.83e-04   1.73
iter 11:   1.36e-15   2.11e-16   4.56e-05   1.51
iter 12:   1.26e-15   2.10e-16   9.23e-06   1.92
iter 13:   1.28e-15   2.00e-16   1.09e-06   1.82
iter 14:   1.29e-15   2.02e-16   1.10e-07   1.54
iter 15:   1.17e-15   2.02e-16   1.11e-08   1.78
iter 16:   1.19e-15   2.06e-16   1.22e-09   ---
--------------------------------------------
[m n] = [400 200]   obj = -739.355   cputime = 26.09

Most of the time (on a MacBook Air) is spent forming M.

Max-Cut: n = 500 (same general code)

Iter. No    P_inf      D_inf      Gap        Cpu
--------------------------------------------
iter  0:   2.04e+01   5.08e+00   4.47e+04   0.27
iter  1:   0.00e+00   2.51e-16   8.23e-01   1.91
iter  2:   0.00e+00   2.17e-16   5.46e-01   1.98
iter  3:   0.00e+00   1.22e-16   2.37e-01   2.16
iter  4:   0.00e+00   1.03e-16   1.47e-01   1.80
iter  5:   0.00e+00   1.05e-16   5.90e-02   2.01
iter  6:   0.00e+00   1.06e-16   1.25e-02   1.82
iter  7:   0.00e+00   9.74e-17   7.59e-03   1.87
iter  8:   0.00e+00   1.07e-16   9.10e-04   1.89
iter  9:   0.00e+00   1.09e-16   1.07e-04   1.79
iter 10:   0.00e+00   9.68e-17   1.08e-05   2.00
iter 11:   0.00e+00   1.04e-16   1.09e-06   1.80
iter 12:   0.00e+00   9.62e-17   1.10e-07   1.76
iter 13:   0.00e+00   1.05e-16   1.11e-08   1.91
iter 14:   0.00e+00   1.06e-16   1.12e-09   ---
--------------------------------------------
[m n] = [500 500]   obj = -3633.6   cputime = 25.2

Which formulation is most efficient: AHO, KSH, or NT?

Comparative study I: AHO ('94-'95)
- AHO: fewest iterations, more robust, fast local convergence.
- Comparison on iteration counts only, no CPU times (AHO requires more work per iteration).

Comparative study II: TTT (Mar. '96)
Iterations / CPU per iteration on Max-Cut, n = 50:

               AHO       KSH       NT(Schur)   NT(LS)
  Iter/CPU     7.4/5.1   7.6/3.8   8.4/4.0     8.4/4.4
  Total CPU    37.74     28.88     33.6        36.96

- Schur: LU on $M = \mathcal{A}E^{-1}F\mathcal{A}^T$.
- LS: QR on $\hat{\mathcal{A}}^T$, where $M = \hat{\mathcal{A}}\hat{\mathcal{A}}^T$.
- NT(LS) achieves higher accuracy (a major advantage of NT over AHO and KSH).

How to write $M = \mathcal{A}E^{-1}F\mathcal{A}^T = \hat{\mathcal{A}}\hat{\mathcal{A}}^T$?

NT: 2 Cholesky factorizations + an SVD give $W^{1/2}$ (TTT 3/96):

$$\hat{\mathcal{A}} = \mathcal{A}(W^{1/2} \otimes W^{1/2}), \qquad e_i^T\hat{\mathcal{A}} = \operatorname{vec}(W^{1/2}A_iW^{1/2})^T.$$

AHO: $M = \mathcal{A}E^{-1}F\mathcal{A}^T$ is asymmetric, so no such factorization.

KSH: the Cholesky factors $X = L_xL_x^T$ and $Z = L_zL_z^T$ are already available if backtracking is used:

$$\hat{\mathcal{A}} = \mathcal{A}(L_x \otimes L_z^{-T}), \qquad e_i^T\hat{\mathcal{A}} = \operatorname{vec}(L_z^{-1}A_iL_x)^T.$$

Least cost per iteration among the three.
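A numeric check (mine) of the KSH factorization: the rows $\operatorname{vec}(L_z^{-1}A_iL_x)^T$ reproduce the Schur-matrix entries $\operatorname{tr}(XA_iZ^{-1}A_j)$, the slide's trace formula with the factors cycled:

```python
import numpy as np

def vec(M): return M.flatten(order='F')

rng = np.random.default_rng(5)
n, m = 4, 3
sym = lambda B: (B + B.T) / 2
X = sym(rng.standard_normal((n, n))) + n * np.eye(n)
Z = sym(rng.standard_normal((n, n))) + n * np.eye(n)
As = [sym(rng.standard_normal((n, n))) for _ in range(m)]

Lx, Lz = np.linalg.cholesky(X), np.linalg.cholesky(Z)
Lzi = np.linalg.inv(Lz)

Ahat = np.vstack([vec(Lzi @ Ai @ Lx) for Ai in As])
M_ls = Ahat @ Ahat.T                                   # Ahat Ahat^T
Zi = np.linalg.inv(Z)
M_tr = np.array([[np.trace(X @ Ai @ Zi @ Aj) for Aj in As] for Ai in As])
print(np.allclose(M_ls, M_tr))   # True
```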