III. Applications in convex optimization


Outline:
- nonsymmetric interior-point methods
- partial separability and decomposition
  - partial separability
  - first-order methods
  - interior-point methods

Conic linear optimization

    primal:  minimize    c^T x          dual:  maximize    b^T y
             subject to  Ax = b                subject to  A^T y + s = c
                         x ∈ C                             s ∈ C*

- C is a proper cone (convex, closed, pointed, with nonempty interior)
- C* = {z | z^T x ≥ 0 for all x ∈ C} is the dual cone
- this format is widely used in the recent literature on convex optimization

Interior-point methods: a convenient format for extending interior-point methods from linear optimization to general convex optimization.

Modeling: a small number of primitive cones is sufficient to model most convex constraints encountered in practice.
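As a quick numerical illustration (not from the slides), the sketch below builds a feasible primal-dual pair for the simplest proper cone, the nonnegative orthant C = C* = R^n_+, and checks that the duality gap c^T x − b^T y reduces to x^T s ≥ 0; all data here are made up.

```python
import numpy as np

# Illustrative sketch: weak duality for the conic LP pair above with
# C = C* = R^n_+. A feasible primal-dual pair is built by construction,
# and the duality gap reduces to x^T s >= 0.
rng = np.random.default_rng(0)
m, n = 3, 6

A = rng.standard_normal((m, n))
x = rng.random(n)                 # x in C (componentwise nonnegative)
b = A @ x                         # primal feasibility: Ax = b

y = rng.standard_normal(m)
s = rng.random(n)                 # s in C* (componentwise nonnegative)
c = A.T @ y + s                   # dual feasibility: A^T y + s = c

gap = c @ x - b @ y               # equals x^T s, hence nonnegative
assert np.isclose(gap, x @ s) and gap >= 0
print(gap)
```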

Symmetric cones

- most current solvers and modeling systems use three types of cones: the nonnegative orthant, the second-order cone, and the positive semidefinite cone
- these cones are not only self-dual but symmetric (self-scaled)
- symmetry is exploited in symmetric primal-dual interior-point methods
- there are large gaps in (linear algebra) complexity between the three cones (see the examples on pages 5-6)

Sparse semidefinite optimization problem

Primal problem:

    minimize    tr(CX)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ⪰ 0

Dual problem:

    maximize    b^T y
    subject to  ∑_{i=1}^m y_i A_i + S = C
                S ⪰ 0

Aggregate sparsity pattern E: the union of the sparsity patterns of C, A_1, ..., A_m.
- a feasible X is usually dense, even when the aggregate sparsity pattern is sparse
- a feasible S is sparse, with sparsity pattern E
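A small sketch of how the aggregate pattern can be computed from the problem data. The random matrices merely stand in for the (symmetric) SDP data, and the helper names are illustrative.

```python
import scipy.sparse as sp

# Sketch: the aggregate sparsity pattern E as the union of the nonzero
# patterns of C, A_1, ..., A_m (scipy sparse matrices).
def pattern(M):
    """0/1 sparse matrix with the nonzero pattern of M."""
    P = abs(M.tocsr())
    P.data[:] = 1
    return P

def aggregate_pattern(C, A_list):
    E = pattern(C)
    for Ai in A_list:
        E = E + pattern(Ai)          # entries count pattern occurrences
    return pattern(E)                # 0/1 pattern of the union

C = sp.random(5, 5, density=0.2, random_state=0)
A1 = sp.random(5, 5, density=0.2, random_state=1)
print(aggregate_pattern(C, [A1]).nnz)   # number of entries in E
```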

Equivalent nonsymmetric conic LPs

Primal problem:

    minimize    tr(CX)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ∈ C

Dual problem:

    maximize    b^T y
    subject to  ∑_{i=1}^m y_i A_i + S = C
                S ∈ C*

- the variables X and S are sparse matrices in S^n_E
- C = Π_E(S^n_+) is the cone of PSD-completable matrices with sparsity pattern E
- C* = S^n_+ ∩ S^n_E is the cone of PSD matrices with sparsity pattern E
- C is not self-dual, so symmetric interior-point methods do not apply

Nonsymmetric interior-point methods

    minimize    tr(CX)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ∈ Π_E(S^n_+)

- can be solved by nonsymmetric primal or dual barrier methods
- logarithmic barriers for the cone Π_E(S^n_+) and its dual cone S^n_+ ∩ S^n_E:

    φ*(X) = sup_S (−tr(XS) + log det S),        φ(S) = −log det S

- barrier values and derivatives can be evaluated fast if the pattern is chordal (Fukuda et al. 2000, Burer 2003, Srijuntongsiri and Vavasis 2004, Andersen et al. 2010)
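For the dual barrier this is easy to make concrete: φ(S) = −log det S is evaluated from a Cholesky factorization. The dense sketch below is illustrative only; the cited chordal algorithms work with a sparse Cholesky factor and restrict the gradient −S^{-1} to the pattern E instead of forming it densely.

```python
import numpy as np

# Dense sketch of evaluating the dual barrier phi(S) = -log det S
# and its gradient from a Cholesky factorization.
def dual_barrier(S):
    L = np.linalg.cholesky(S)                  # S = L L^T, requires S ≻ 0
    value = -2.0 * np.log(np.diag(L)).sum()    # -log det S
    grad = -np.linalg.inv(S)                   # gradient of phi at S
    return value, grad

S = np.array([[2.0, 0.5],
              [0.5, 1.0]])
value, _ = dual_barrier(S)
print(value)                                   # -log(1.75) ≈ -0.56
```

The primal barrier φ* has no closed form; evaluating it (and its derivatives) efficiently is exactly what the chordal algorithms referenced above provide.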

Primal path-following method

Central path: the solution X(µ), y(µ), S(µ) of

    tr(A_i X) = b_i,  i = 1, ..., m
    ∑_{j=1}^m y_j A_j + S = C
    µ ∇φ*(X) + S = 0

Search direction at iterate X, y, S: solve the linearized central path equations

    tr(A_i ΔX) = r_i,  i = 1, ..., m
    ∑_{i=1}^m Δy_i A_i + ΔS = C − ∑_{i=1}^m y_i A_i − S
    µ ∇²φ*(X)[ΔX] + ΔS = −µ ∇φ*(X) − S

where r_i = b_i − tr(A_i X) are the primal residuals.

Dual path-following method

Central path: an equivalent set of equations is

    tr(A_i X) = b_i,  i = 1, ..., m
    ∑_{j=1}^m y_j A_j + S = C
    X + µ ∇φ(S) = 0

Search direction at iterate X, y, S: solve the linearized central path equations

    tr(A_i ΔX) = r_i,  i = 1, ..., m
    ∑_{i=1}^m Δy_i A_i + ΔS = C − ∑_{i=1}^m y_i A_i − S
    ΔX + µ ∇²φ(S)[ΔS] = −µ ∇φ(S) − X

Computing search directions

- eliminating ΔX and ΔS from the linearized equations gives a system H Δy = g
- in a primal method, H_ij is the inner product of A_i and ∇²φ*(X)[A_j]:

    H_ij = tr(A_i ∇²φ*(X)[A_j])

- in a dual method, H_ij is the inner product of A_i and ∇²φ(S)[A_j]:

    H_ij = tr(A_i ∇²φ(S)[A_j])

- the algorithms from lecture 2 can be used to evaluate the gradients and Hessians
- the system H Δy = g is solved via a dense Cholesky or QR factorization
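In the dual case the Hessian has the explicit form ∇²φ(S)[A_j] = S^{-1} A_j S^{-1}, so H can be assembled directly. A dense sketch with made-up data, ignoring the sparse evaluation techniques of lecture 2:

```python
import numpy as np

# Sketch: assembling the Schur complement matrix H for the dual method,
# with phi(S) = -log det S, so grad^2 phi(S)[A_j] = S^{-1} A_j S^{-1}
# and H_ij = tr(A_i S^{-1} A_j S^{-1}).
def dual_schur_matrix(S, A):
    Sinv = np.linalg.inv(S)
    W = [Sinv @ Aj @ Sinv for Aj in A]        # grad^2 phi(S)[A_j]
    m = len(A)
    H = np.empty((m, m))
    for i in range(m):
        for j in range(m):
            H[i, j] = np.trace(A[i] @ W[j])
    return H

rng = np.random.default_rng(0)
n, m = 5, 3
M = rng.standard_normal((n, n))
S = M @ M.T + n * np.eye(n)                   # S ≻ 0
A = [(B + B.T) / 2 for B in rng.standard_normal((m, n, n))]
H = dual_schur_matrix(S, A)
# H is symmetric positive definite (for linearly independent A_i),
# so H dy = g can be solved by a dense Cholesky factorization.
assert np.allclose(H, H.T) and np.all(np.linalg.eigvalsh(H) > 0)
```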

Sparsity patterns

- sparsity patterns from the University of Florida Sparse Matrix Collection
- m = 200 constraints; random data with 0.05% nonzeros in A_i relative to E

[Figure: the eight test sparsity patterns]
    rs228 (n = 1,919), rs35 (n = 2,003), rs200 (n = 3,025), rs365 (n = 4,704),
    rs1555 (n = 7,479), rs828 (n = 10,800), rs1184 (n = 14,822), rs1288 (n = 30,401)

Results

Average time per iteration (seconds) for different solvers:

    n       DSDP     SDPA    SDPA-C   SDPT3   SeDuMi   SMCP
    1919     1.4     30.7      5.7     10.7    511.2     2.3
    2003     4.0     34.4     41.5     13.0    521.1    15.3
    3025     2.9    128.3      6.0     33.0   1856.9     2.2
    4704    15.2    407.0     58.8     99.6   4347.0    18.6

    n       DSDP    SDPA-C    SMCP
    7479     22.1     23.1     9.5
    10800   482.1   1812.8   311.2
    14822   791.0   2925.4   463.8
    30401     mem   2070.2   320.4     (mem: out of memory)

- SMCP uses the nonsymmetric matrix cone approach (Andersen et al. 2010)
- code and more benchmarks at github.com/cvxopt/smcp

Band pattern

- SDPs of order n with bandwidth 11 and m = 100 equality constraints

[Figure: time per iteration versus n (log-log) for SMCP (variants M1, M2), CSDP, DSDP, SDPA, SDPA-C, SDPT3, SeDuMi]

- for the nonsymmetric solver SMCP (two variants M1, M2), the complexity is linear in n (Andersen et al. 2010)

Arrow pattern

- matrix norm minimization of page 6: matrices of size p × q, with q = 10 and m = 100 variables

[Figure: time per iteration versus p + q (log-log) for SMCP (M1, M2), CSDP, DSDP, SDPA, SDPA-C, SDPT3, SeDuMi]

- for the nonsymmetric solver SMCP (M1, M2), the complexity is linear in p

Outline:
- nonsymmetric interior-point methods
- partial separability and decomposition
  - partial separability
  - first-order methods
  - interior-point methods

Partial separability

Partially separable function (Griewank and Toint 1982):

    f(x) = ∑_{k=1}^l f_k(P_{β_k} x)

- x is an n-vector; β_1, ..., β_l are (small) overlapping index sets in {1, 2, ..., n}
- P_{β_k} is the selection matrix with P_{β_k} x = x_{β_k}

Example:

    f(x) = f_1(x_1, x_4, x_5) + f_2(x_1, x_3) + f_3(x_2, x_3) + f_4(x_2, x_4)

Partially separable set:

    C = {x ∈ R^n | x_{β_k} ∈ C_k, k = 1, ..., l}

- the indicator function of a partially separable set is a partially separable function
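A minimal sketch of evaluating such a function, using the example's index sets (translated to 0-based indexing) and made-up quadratic terms f_k:

```python
import numpy as np

# Sketch of a partially separable function, following the slide's example
# f(x) = f1(x1,x4,x5) + f2(x1,x3) + f3(x2,x3) + f4(x2,x4).
beta = [[0, 3, 4], [0, 2], [1, 2], [1, 3]]   # index sets beta_k
f_terms = [lambda z: float(z @ z)] * 4       # placeholder terms f_k

def f(x):
    # each term only sees its small subvector P_{beta_k} x = x[beta_k]
    return sum(fk(x[bk]) for fk, bk in zip(f_terms, beta))

x = np.arange(1.0, 6.0)                      # n = 5
print(f(x))                                  # 85.0
```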

Interaction graph

- vertices V = {1, 2, ..., n}; edges {i, j} ∈ E if and only if i, j ∈ β_k for some k
- if {i, j} ∉ E, then f is separable in x_i and x_j when the other variables are fixed:

    f(x + s e_i + t e_j) = f(x + s e_i) + f(x + t e_j) − f(x)    for all x ∈ R^n, s, t ∈ R

Example: f(x) = f_1(x_1, x_4, x_5) + f_2(x_1, x_3) + f_3(x_2, x_3) + f_4(x_2, x_4)

[Figure: interaction graph on vertices {1, ..., 5} with edges {1,3}, {1,4}, {1,5}, {2,3}, {2,4}, {4,5}]
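The edge set of the interaction graph can be read directly off the index sets; a small sketch reproducing the example's graph (index sets 1-based, as on the slide):

```python
from itertools import combinations

# Sketch: build the interaction graph of the example from its index sets.
beta = [[1, 4, 5], [1, 3], [2, 3], [2, 4]]

edges = set()
for bk in beta:
    # any two indices appearing together in some beta_k are adjacent
    edges.update(frozenset(p) for p in combinations(bk, 2))

print(sorted(tuple(sorted(e)) for e in edges))
# [(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (4, 5)]
```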

Example: PSD completable cone with chordal pattern

- for chordal E, the cone Π_E(S^n_+) is partially separable (see page 104):

    Π_E(S^n_+) = {X ∈ S^n_E | X_{γ_i γ_i} ⪰ 0 for all cliques γ_i}

- the interaction graph is chordal

[Figure: a chordal sparsity pattern of order 6 with cliques γ_1 = {1,3,4}, γ_2 = {2,4}, γ_3 = {3,4,5}, γ_4 = {5,6}; its clique tree; and the clique tree of the interaction graph]

Partially separable convex optimization

    minimize    f(x) = ∑_{k=1}^l f_k(P_{β_k} x)

Equivalent problem:

    minimize    ∑_{k=1}^l f_k(x̃_k)
    subject to  x̃ = P x

- the splitting variables x̃_k are introduced to make the cost function separable
- P and x̃ are the stacked matrix and vector

    P = [P_{β_1}; ...; P_{β_l}],        x̃ = (x̃_1, ..., x̃_l)

- P^T P is diagonal: (P^T P)_{ii} is the number of sets β_k that contain index i
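A sketch that builds the stacked matrix P for the example index sets (0-based) and verifies the diagonal property; the construction is purely illustrative.

```python
import numpy as np

# Sketch: stacked selection matrix P and the diagonal property of P^T P.
n = 5
beta = [[0, 3, 4], [0, 2], [1, 2], [1, 3]]

blocks = []
for bk in beta:
    Pk = np.zeros((len(bk), n))
    Pk[np.arange(len(bk)), bk] = 1.0      # P_{beta_k} x = x[bk]
    blocks.append(Pk)
P = np.vstack(blocks)

# (P^T P)_{ii} counts how many sets beta_k contain index i
print(np.diag(P.T @ P))                   # [2. 2. 2. 2. 1.]
```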

Decomposition via first-order methods

Reformulated problem and its dual (f_k* is the conjugate function of f_k):

    primal:  minimize    ∑_{k=1}^l f_k(x̃_k)        dual:  maximize    −∑_{k=1}^l f_k*(s̃_k)
             subject to  x̃ ∈ range(P)                     subject to  s̃ ∈ nullspace(P^T)

- the cost functions are separable
- the diagonal property of P^T P makes projections on range(P) inexpensive (see the sketch below)

Algorithms: many algorithms can exploit these properties, for example
- Douglas-Rachford (DR) splitting applied to the primal
- the alternating direction method of multipliers (ADMM)
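Concretely, projecting onto range(P) amounts to averaging the copies of each original variable, since P (P^T P)^{-1} P^T acts coordinatewise. A sketch with the same example data; the helper name is illustrative.

```python
import numpy as np

# Sketch: projection onto range(P). Because P^T P is diagonal, the
# projection averages the copies of each original variable and
# redistributes the averages; no linear system needs to be solved.
n = 5
beta = [[0, 3, 4], [0, 2], [1, 2], [1, 3]]
P = np.vstack([np.eye(n)[bk, :] for bk in beta])

def project_range(P, xt):
    d = np.diag(P.T @ P)                  # multiplicity of each index
    return P @ ((P.T @ xt) / d)           # average, then redistribute

xt = np.random.default_rng(0).standard_normal(P.shape[0])
pr = project_range(P, xt)
assert np.allclose(project_range(P, pr), pr)   # projections are idempotent
```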

Example: sparse nearest matrix problems

Find the nearest sparse PSD-completable matrix with a given sparsity pattern:

    minimize    ‖X − A‖_F^2
    subject to  X ∈ Π_E(S^n_+)

Find the nearest sparse PSD matrix with a given sparsity pattern:

    minimize    ‖S + A‖_F^2
    subject to  S ∈ S^n_+ ∩ S^n_E

These two problems are duals: with K = Π_E(S^n_+) and K* = S^n_+ ∩ S^n_E, the solutions give the Moreau decomposition A = X − S.

Decomposition methods

From the decomposition theorems (pages 82 and 104), the problems can be written as

    primal:  minimize    ‖X − A‖_F^2
             subject to  X_{γ_i γ_i} ⪰ 0 for all cliques γ_i

    dual:    minimize    ‖A + ∑_{i ∈ V^c} P_{γ_i}^T H_i P_{γ_i}‖_F^2
             subject to  H_i ⪰ 0 for all i ∈ V^c

Algorithms:
- Dykstra's algorithm (dual block coordinate ascent)
- (fast) dual projected gradient algorithm (FISTA)
- Douglas-Rachford splitting, ADMM

Each iteration requires a sequence of projections on PSD cones of order |γ_i| (eigenvalue decompositions), as in the sketch below.
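The building block shared by all three methods is the projection of one (small) clique submatrix onto the PSD cone via an eigenvalue decomposition; a minimal sketch with made-up data:

```python
import numpy as np

# Sketch: projection of a small symmetric matrix (one clique submatrix)
# onto the PSD cone, by clipping negative eigenvalues.
def proj_psd(B):
    w, V = np.linalg.eigh(B)                 # B = V diag(w) V^T
    return (V * np.maximum(w, 0.0)) @ V.T    # keep nonnegative eigenvalues

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
B = (B + B.T) / 2
X = proj_psd(B)
assert np.all(np.linalg.eigvalsh(X) >= -1e-9)   # X is (numerically) PSD
```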

Results

Matrices from the University of Florida sparse matrix collection:

    n        density   #cliques   avg. clique size   max. clique
    20141    2.80e-3       1098               35.7           168
    38434    1.25e-3       2365               28.1           188
    57975    9.04e-4       8875               14.9           132
    79841    9.71e-4       4247               44.4           337
    114599   2.02e-4       7035               18.9            58

             total runtime (sec)        time/iteration (sec)
    n        FISTA   Dykstra   DR       FISTA   Dykstra   DR
    20141    2.5e2   3.9e1     3.8e1      1.0     1.6     1.5
    38434    4.7e2   4.7e1     6.2e1      2.1     1.9     2.5
    57975    > 4hr   1.4e2     1.1e3      3.5     5.7     6.4
    79841    2.4e3   3.0e2     2.4e2      6.3     7.6     9.7
    114599   5.3e2   5.5e1     1.0e2      2.6     2.2     4.0

(Sun and Vandenberghe 2015)

Conic optimization with partially separable cones

    minimize    c^T x
    subject to  Ax = b
                x ∈ C

- assume C is partially separable: C = {x ∈ R^n | P_{β_k} x ∈ C_k, k = 1, ..., l}
- the most important application is sparse semidefinite programming (C is the vectorized PSD-completable cone)
- the bottleneck in interior-point methods is the Schur complement equation

    A H^{-1} A^T Δy = r

  (in a primal barrier method, H is the Hessian of the barrier for C)
- the coefficient matrix of the Schur complement equation is often dense, even for sparse A

Reformulation

    minimize    c^T x
    subject to  Ax = b
                P_{β_k} x ∈ C_k,  k = 1, ..., l

- introduce l splitting variables x̃_k = P_{β_k} x and add the consistency constraint

    x̃ ∈ range(P),    where x̃ = (x̃_1, ..., x̃_l)

- choose c̃ and Ã such that ÃP = A and c̃^T P = c^T, with P = [P_{β_1}; ...; P_{β_l}]

Converted problem:

    minimize    c̃^T x̃
    subject to  Ã x̃ = b
                x̃ ∈ C_1 × ... × C_l
                x̃ ∈ range(P)

Chordal structure in interaction graph

- suppose the interaction graph is chordal and the sets β_k are its cliques
- the cliques β_k that contain a given index j form a subtree of the clique tree
- therefore the consistency constraint x̃ ∈ range(P) is equivalent to

    P_{α_j}(P_{β_k}^T x̃_k − P_{β_j}^T x̃_j) = 0

  for each vertex j and its parent k in a clique tree, where α_j is the intersection of β_j and its parent

[Figure: clique tree with a splitting variable x̃_i ∈ C_i at each vertex β_i and a consistency constraint on each edge]

Schur complement system of converted problem

    minimize    c̃^T x̃
    subject to  Ã x̃ = b
                x̃ ∈ C_1 × ... × C_l
                B x̃ = 0    (consistency equations)

Schur complement equation in an interior-point method:

    [ Ã H^{-1} Ã^T    Ã H^{-1} B^T ] [ Δy ]   [ r_1 ]
    [ B H^{-1} Ã^T    B H^{-1} B^T ] [ Δu ] = [ r_2 ]

- H is block-diagonal (in a primal barrier method, it is the Hessian of the barrier for C_1 × ... × C_l)
- the system is larger than the Schur complement system before conversion
- however, the 1,1 block is often sparse
- for semidefinite optimization, this is known as the clique-tree conversion method (Fukuda et al. 2000, Kim et al. 2011)

Example

[Figure: the chordal sparsity pattern of order 6 from the earlier example, with cliques γ_1 = {1,3,4}, γ_2 = {2,4}, γ_3 = {3,4,5}, γ_4 = {5,6}, and its clique tree]

A 6 × 6 matrix X with this pattern has a positive semidefinite completion if and only if the matrices

    X_{γ_1 γ_1} = [ X_11  X_13  X_14 ]        X_{γ_2 γ_2} = [ X_22  X_24 ]
                  [ X_31  X_33  X_34 ]                      [ X_42  X_44 ]
                  [ X_41  X_43  X_44 ]

    X_{γ_3 γ_3} = [ X_33  X_34  X_35 ]        X_{γ_4 γ_4} = [ X_55  X_56 ]
                  [ X_43  X_44  X_45 ]                      [ X_65  X_66 ]
                  [ X_53  X_54  X_55 ]

are positive semidefinite.
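A sketch that tests this condition numerically; the clique index sets are translated to 0-based indexing, and the helper name is illustrative.

```python
import numpy as np

# Sketch: numerical test of the clique-submatrix condition above.
gamma = [[0, 2, 3], [1, 3], [2, 3, 4], [4, 5]]   # cliques, 0-based

def psd_completable(X, cliques, tol=1e-9):
    """True iff every clique submatrix X[g][:, g] is PSD; for a chordal
    pattern this is equivalent to X having a PSD completion."""
    return all(np.linalg.eigvalsh(X[np.ix_(g, g)]).min() >= -tol
               for g in cliques)

X = np.eye(6)                       # trivially completable
print(psd_completable(X, gamma))    # True
```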

Example (continued)

Define a splitting variable for each of the four clique submatrices:

    X̃_1 ∈ S^3,    X̃_2 ∈ S^2,    X̃_3 ∈ S^3,    X̃_4 ∈ S^2

and add the consistency constraints

    [ X̃_{1,22}  X̃_{1,23} ]   [ X̃_{3,11}  X̃_{3,12} ]
    [ X̃_{1,32}  X̃_{1,33} ] = [ X̃_{3,21}  X̃_{3,22} ],      X̃_{2,22} = X̃_{3,22},      X̃_{3,33} = X̃_{4,11}

Summary: decomposition and sparse semidefinite optimization

- sparse SDPs with chordal sparsity are partially separable:

    minimize    tr(CX)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X_{γ_k γ_k} ⪰ 0,  k = 1, ..., l

- introducing splitting variables, one can reformulate this as

    minimize    ∑_{k=1}^l tr(C̃_k X̃_k)
    subject to  ∑_{k=1}^l tr(Ã_{ik} X̃_k) = b_i,  i = 1, ..., m
                X̃_k ⪰ 0,  k = 1, ..., l
                consistency constraints

- this was first proposed as a technique for speeding up interior-point methods
- it is also useful in combination with first-order splitting methods (Lu et al. 2007, Lam et al. 2011, Dall'Anese et al. 2013, Sun et al. 2014, ...)
- and useful for distributed algorithms (Pakazad et al. 2014)