Sparse Matrix Theory and Semidefinite Optimization


Sparse Matrix Theory and Semidefinite Optimization
Lieven Vandenberghe, Department of Electrical Engineering, University of California, Los Angeles
Joint work with Martin S. Andersen and Yifan Sun
Third VORACE Workshop, Toulouse, 20-22 September 2016

Semidefinite program (SDP)

    minimize    tr(C X)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ⪰ 0

- variable X is an n × n symmetric matrix; X ⪰ 0 means X is positive semidefinite
- matrix inequalities arise naturally in many areas (for example, control, statistics)
- used in convex modeling systems (CVX, YALMIP, CVXPY, ...)
- relaxations of nonconvex quadratic and polynomial optimization
- in many applications the coefficients A_i, C are sparse
- the optimal X is typically dense, even for sparse A_i, C

SDP with band structure

(figure: time per iteration in seconds, on log-log axes against n, for solving an SDP with banded matrices of bandwidth 11 and 100 constraints, with SDPT3 and SeDuMi and an O(n^2) reference line)

- for bandwidth 1 (a linear program), the cost per iteration is linear in n
- for bandwidth > 1, the cost grows as n^2 or faster

(from Andersen, Dahl, Vandenberghe 2010)

Power flow optimization

an optimization problem with non-convex quadratic constraints

Variables
- complex voltage v_i at each node (bus) of the network
- complex power flow s_ij entering the link (line) from node i to node j

Non-convex constraints
- (lower) bounds on voltage magnitudes: v_min ≤ |v_i| ≤ v_max
- flow balance equations: s_ij + s_ji = ḡ_ij |v_i − v_j|^2, where g_ij is the admittance of the line from node i to j

Semidefinite relaxation of optimal power flow problem

- introduce the matrix variable X = Re(v v^H), i.e., with elements X_ij = Re(v_i v̄_j)
- voltage bounds and flow balance equations are convex in X:

      v_min ≤ |v_i| ≤ v_max            ⟹  v_min^2 ≤ X_ii ≤ v_max^2
      s_ij + s_ji = ḡ_ij |v_i − v_j|^2  ⟹  s_ij + s_ji = ḡ_ij (X_ii + X_jj − 2X_ij)

- replace the constraint X = Re(v v^H) with the weaker constraint X ⪰ 0
- the relaxation is exact if the optimal X has rank two

Sparsity in SDP relaxation: the off-diagonal entry X_ij appears in the constraints only if there is a line between buses i and j

(Jabr 2006, Bai et al. 2008, Lavaei and Low 2012, ...)

Outline

    minimize    tr(C X)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ⪰ 0

- structure in the solution X that results from sparsity in the coefficients A_i, C
- applications of sparse matrix theory to semidefinite optimization algorithms

1. Chordal graphs
2. Decomposition of sparse matrix cones
3. Multifrontal algorithms for barrier functions
4. Minimum rank positive semidefinite completion

Symmetric sparsity pattern

(figure: graph on vertices 1-5)

    A = [ A11  A21  A31   0   A51 ]
        [ A21  A22   0   A42   0  ]
        [ A31   0   A33   0   A53 ]
        [  0   A42   0   A44  A54 ]
        [ A51   0   A53  A54  A55 ]

- the sparsity pattern of a symmetric n × n matrix is a set of nonzero positions E ⊆ {{i, j} | i, j ∈ {1, 2, ..., n}}
- A has sparsity pattern E if A_ij = 0 whenever i ≠ j and {i, j} ∉ E; notation: A ∈ S^n_E
- represented by an undirected graph (V, E) with edges E and vertices V = {1, ..., n}
- a clique (maximal complete subgraph) forms a maximal dense principal submatrix


Chordal graph

- undirected graph G = (V, E) with vertex set V, edge set E ⊆ {{v, w} | v, w ∈ V}
- a chord of a cycle is an edge between non-consecutive vertices
- G is chordal if every cycle of length greater than three has a chord

(figure: a 6-cycle on vertices a, b, c, d, e, f without chords is not chordal; the same cycle with chords added is chordal)

also known as triangulated, decomposable, rigid circuit graph, ...

History

chordal graphs have been studied in many disciplines since the 1960s:
- combinatorial optimization (a class of perfect graphs)
- linear algebra (sparse factorization, completion problems)
- computer science (database theory)
- machine learning (graphical models, probabilistic networks)
- nonlinear optimization (partial separability)

first used in semidefinite optimization by Fujisawa, Kojima, Nakata (1997)

Chordal sparsity and Cholesky factorization

Cholesky factorization of positive definite A ∈ S^n_E:

    P A P^T = L D L^T

with P a permutation, L unit lower triangular, D positive diagonal.

- if E is chordal, then there exists a permutation for which P^T(L + L^T)P ∈ S^n_E, i.e., A has a zero-fill Cholesky factorization
- if E is not chordal, then for every P there exist positive definite A ∈ S^n_E for which P^T(L + L^T)P ∉ S^n_E

(Rose 1970)
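A minimal NumPy illustration of the fill-in dichotomy, with ad-hoc test matrices: a tridiagonal (chordal) pattern factors with zero fill in the natural order, while the non-chordal 4-cycle pattern produces a fill entry.

```python
import numpy as np

# fill_positions returns the strictly-lower positions that are nonzero in the
# Cholesky factor but zero in A (i.e., fill-in), for the identity ordering.
def fill_positions(A, tol=1e-12):
    L = np.linalg.cholesky(A)
    return {(i, j) for i in range(len(A)) for j in range(i)
            if abs(L[i, j]) > tol and abs(A[i, j]) < tol}

# tridiagonal pattern (chordal): no fill
T = 3*np.eye(4) + np.diag([1., 1., 1.], 1) + np.diag([1., 1., 1.], -1)
assert fill_positions(T) == set()

# 4-cycle pattern {1,2},{2,3},{3,4},{4,1} (not chordal): fill at (4,2)
C = 3*np.eye(4)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    C[i, j] = C[j, i] = 1.0
assert fill_positions(C) == {(3, 1)}
```

For the 4-cycle, no reordering avoids fill, in agreement with Rose's result; for the tridiagonal pattern the natural order is a perfect elimination ordering.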

Examples

(figures: simple chordal patterns; the sparsity pattern of a Cholesky factor, showing the edges of a non-chordal sparsity pattern together with the fill entries of the factorization, which form a chordal extension of the non-chordal pattern)

Perfect elimination ordering

ordered graph (V, E, σ): undirected graph plus a numbering σ of the vertices

(figure: example ordered graph on vertices a-f)

- higher neighborhood of vertex v: adj+(v) = {w | {w, v} ∈ E, σ^{-1}(w) > σ^{-1}(v)}
- σ is a perfect elimination ordering if all higher neighborhoods are complete
- a perfect elimination ordering exists if and only if (V, E) is chordal (Fulkerson and Gross 1965)
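The Fulkerson-Gross characterization gives the standard chordality test, sketched below in pure Python: maximum cardinality search produces an ordering whose reverse is a perfect elimination ordering exactly when the graph is chordal, and we then verify that every higher neighborhood is complete (the helper `is_chordal` is an illustrative name).

```python
def is_chordal(vertices, edges):
    adj = {v: set() for v in vertices}
    for v, w in edges:
        adj[v].add(w); adj[w].add(v)
    # maximum cardinality search: repeatedly visit the unvisited vertex with
    # the most visited neighbors; the reverse of the visit order is a
    # candidate perfect elimination ordering
    weight, order = {v: 0 for v in vertices}, []
    while weight:
        v = max(weight, key=weight.get)
        order.append(v); del weight[v]
        for w in adj[v]:
            if w in weight:
                weight[w] += 1
    order.reverse()
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        higher = [w for w in adj[v] if pos[w] > pos[v]]
        # every higher neighborhood must induce a complete subgraph
        if any(u not in adj[w] for i, w in enumerate(higher) for u in higher[i+1:]):
            return False
    return True

cycle6 = [('a','b'), ('b','c'), ('c','d'), ('d','e'), ('e','f'), ('f','a')]
print(is_chordal('abcdef', cycle6))                                # → False
print(is_chordal('abcdef', cycle6 + [('a','c'), ('a','d'), ('a','e')]))  # → True
```

The first call is the chordless 6-cycle from the earlier slide; adding the three chords from a makes every cycle of length greater than three chorded.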


Elimination tree

(figure: a 9-vertex chordal graph and its elimination tree)

- chordal graph with perfect elimination ordering σ = (1, 2, ..., n)
- the parent p(v) of vertex v in the elimination tree is the first vertex in adj+(v)

Expanded elimination tree

(figure: expanded elimination tree for the 9-vertex example, with column blocks such as {8, 9}, {7, 8}, {5, 7, 8})

- each block represents a column: the bottom row is v, the top row is adj+(v)
- the vertices of the tree are complete subgraphs (and include all cliques)
- for each v, the tree vertices that contain v form a subtree of the elimination tree


Supernodal elimination tree

(figure: a 17-vertex chordal graph and its supernodal elimination tree, with cliques such as {13, 14, 15, 16, 17} at the root)

- the vertices of the tree are the cliques of the sparsity graph
- the top row of each block is the intersection of the clique with its parent clique
- the bottom rows are supernodes; these form a partition of {1, 2, ..., n}
- for each v, the cliques that contain v form a subtree of the elimination tree


Outline

1. Chordal graphs
2. Decomposition of sparse matrix cones
3. Multifrontal algorithms for barrier functions
4. Minimum rank positive semidefinite completion

Sparse matrix cones

we define two matrix cones in S^n_E (symmetric n × n matrices with pattern E):

- positive semidefinite matrices: S^n_+ ∩ S^n_E = {X ∈ S^n_E | X ⪰ 0}
- matrices with a positive semidefinite completion: Π_E(S^n_+) = {Π_E(X) | X ⪰ 0}, where Π_E is the projection on S^n_E

Properties

- the two cones are convex
- closed, pointed, with nonempty interior (relative to S^n_E)
- they form a pair of dual cones (for the trace inner product)

Positive semidefinite matrices with chordal sparsity pattern

A ∈ S^n_E is positive semidefinite if and only if it can be expressed as

    A = Σ_{cliques γ_i} P_{γ_i}^T H_i P_{γ_i}    with H_i ⪰ 0

(for an index set β, P_β is the 0-1 matrix of size |β| × n with P_β x = x_β for all x)

(figure: A ⪰ 0 decomposed as P_{γ_1}^T H_1 P_{γ_1} + P_{γ_2}^T H_2 P_{γ_2} + P_{γ_3}^T H_3 P_{γ_3} with H_i ⪰ 0)

(Griewank and Toint 1984, Agler, Helton, McCullough, Rodman 1988)

Decomposition from Cholesky factorization

(figure: example with two cliques, A = P_{γ_1}^T H_1 P_{γ_1} + P_{γ_2}^T H_2 P_{γ_2})

- H_1 and H_2 follow by combining columns in the Cholesky factorization
- readily computed from the update matrices in a multifrontal Cholesky factorization
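A sketch of this construction for a tridiagonal (chordal) pattern, whose cliques are γ_j = {j, j+1}: in the factorization A = L D L^T, column j contributes the PSD term d_j l_j l_j^T, supported exactly on clique γ_j (the test matrix and the helper name `ldl_terms` are arbitrary choices for illustration).

```python
import numpy as np

def ldl_terms(A):
    # right-looking LDL^T factorization; returns the per-column PSD terms
    # d_j * l_j l_j^T, which sum to A
    n = len(A)
    L, D, W = np.eye(n), np.zeros(n), A.astype(float).copy()
    for j in range(n):
        D[j] = W[j, j]
        L[j+1:, j] = W[j+1:, j] / D[j]
        W[j+1:, j+1:] -= np.outer(L[j+1:, j], L[j+1:, j]) * D[j]
    return [D[j] * np.outer(L[:, j], L[:, j]) for j in range(n)]

A = 2*np.eye(5) + np.diag([1., -1., 1., -1.], 1) + np.diag([1., -1., 1., -1.], -1)
terms = ldl_terms(A)
assert np.allclose(sum(terms), A)
for j, T in enumerate(terms):
    # each term is PSD and supported on the clique {j, j+1}
    support = {i for i in range(5) if abs(T[i, i]) > 1e-12}
    assert support <= {j, j + 1}
    assert np.linalg.eigvalsh(T).min() >= -1e-12
```

Because the pattern is chordal and the natural order is a perfect elimination ordering, the factor has zero fill and each term stays inside a single clique.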

Non-chordal example (Griewank and Toint 1984)

- graph with chordless cycle (1, 2, ..., k, 1) of length k > 3
- the matrix A = diag(Ã, 0) ∈ S^n_E, where Ã is the k × k matrix with λ on the diagonal, 1 on the first sub- and superdiagonals, and (−1)^{k−1} in the corner positions (1, k) and (k, 1), is positive semidefinite for λ ≥ 2 cos(π/k)
- for λ < 2 there exists no decomposition of Ã into a sum of positive semidefinite terms, one per edge of the cycle (each supported on a pair of adjacent indices)

Partial separability

a function f : R^n → R is partially separable in the variables x_i and x_j (with i ≠ j) if

    f(x + s e_i + t e_j) = f(x + s e_i) + f(x + t e_j) − f(x)    for all x ∈ R^n, s, t ∈ R

i.e., f is separable in the variables x_i and x_j if the other variables are held constant

can be represented by an interaction graph with vertices V = {1, 2, ..., n}: {i, j} ∉ E ⟹ f is partially separable in x_i and x_j

Example: f(x) = f_1(x_1, x_4, x_5) + f_2(x_1, x_3) + f_3(x_2, x_3) + f_4(x_2, x_4)

(figure: interaction graph on vertices 1-5)

(Griewank and Toint 1982, 1984)
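The definition can be checked numerically on the example above, with an arbitrary choice of the terms f_1, ..., f_4 (any smooth functions of the indicated variables would do):

```python
import numpy as np

# arbitrary instance of f(x) = f1(x1,x4,x5) + f2(x1,x3) + f3(x2,x3) + f4(x2,x4)
def f(x):
    x1, x2, x3, x4, x5 = x
    return x1*x4*x5 + x1*x3**2 + np.sin(x2)*x3 + x2**2*x4

rng = np.random.default_rng(0)
x, (s, t) = rng.standard_normal(5), (0.7, -1.3)
e = np.eye(5)

def separable(i, j):
    # test the defining identity at one (generic) point and step sizes
    return np.isclose(f(x + s*e[i] + t*e[j]), f(x + s*e[i]) + f(x + t*e[j]) - f(x))

assert separable(1, 4)       # x2 and x5 share no term: {2, 5} is not an edge
assert not separable(0, 2)   # x1 and x3 interact through f2: {1, 3} is an edge
```

One generic evaluation point suffices to detect the interaction; proving separability of course requires the identity for all x, s, t.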

Convex quadratic function

- the interaction graph of f(x) = x^T A x is the sparsity graph of A
- if G is chordal, with l cliques γ_i, then there exist PSD matrices H_i such that

      f(x) = x^T ( Σ_{i=1}^{l} P_{γ_i}^T H_i P_{γ_i} ) x = Σ_{i=1}^{l} x_{γ_i}^T H_i x_{γ_i}

Example (figure: clique tree with cliques {1, 3, 4}, {3, 4, 5}, {2, 4}, {5, 6}):

    f(x) = x_{(1,3,4)}^T H_1 x_{(1,3,4)} + x_{(3,4,5)}^T H_2 x_{(3,4,5)}
           + x_{(2,4)}^T H_3 x_{(2,4)} + x_{(5,6)}^T H_4 x_{(5,6)}

(clique tree; convex decomposition of f)

Sum of squares

a polynomial f : R^n → R is a sum of squares of degree 2d or less if

    f(x) = Σ_{α ∈ N^n_d} Σ_{β ∈ N^n_d} A_{αβ} x^α x^β = v(x)^T A v(x)    with A ⪰ 0

- x^α with α = (α_1, ..., α_n) denotes the monomial x_1^{α_1} x_2^{α_2} ··· x_n^{α_n}
- N^n_d is the set of nonnegative integer vectors (α_1, ..., α_n) with Σ_i α_i ≤ d
- v(x) is the binom(n + d, d)-vector of monomials in x of degree d or less

Sparse sums of squares: exploit partial separability to reduce the size of A (Lasserre 2006, Kojima and Muramatsu 2009)

Sums of squares with chordal interaction graph

f(x) = v(x)^T A v(x)

- define a graph G = (V, E) with V = {1, ..., n} and {i, j} ∉ E ⟹ f contains no monomials that depend on both x_i and x_j (A_{αβ} = 0 if α_i + β_i > 0 and α_j + β_j > 0)
- the sparsity pattern of A is chordal if G is chordal

Example (n = 6, d = 2): (figures: the clique tree of G, with cliques {1, 3, 4}, {3, 4, 5}, {2, 4}, {5, 6}, and the corresponding clique tree of the sparsity graph of A, whose vertices are monomial sets such as {1, x_5, x_6, x_5^2, x_5 x_6, x_6^2})

Clique decomposition

clique decomposition gives a more compact sum-of-squares representation:

    f(x) = Σ_{i=1}^{l} v(x_{γ_i})^T H_i v(x_{γ_i}),    H_i ⪰ 0 for i = 1, ..., l

where v(x_{γ_i}) is the binom(|γ_i| + d, d)-vector of monomials in x_{γ_i} of degree d or less

Example (n = 6, d = 2, cliques {1, 3, 4}, {3, 4, 5}, {2, 4}, {5, 6}):

    f(x) = v(x_1, x_3, x_4)^T H_1 v(x_1, x_3, x_4) + v(x_3, x_4, x_5)^T H_2 v(x_3, x_4, x_5)
           + v(x_2, x_4)^T H_3 v(x_2, x_4) + v(x_5, x_6)^T H_4 v(x_5, x_6)

(clique tree; sparse sum of squares decomposition of f)

PSD-completable cone with chordal sparsity

A ∈ S^n_E has a positive semidefinite completion if and only if A_{γ_i γ_i} ⪰ 0 for all cliques γ_i

- follows from duality and the clique decomposition of the positive semidefinite cone

Example (three cliques γ_1, γ_2, γ_3):

    A is PSD completable  ⟺  A_{γ_1 γ_1} ⪰ 0,  A_{γ_2 γ_2} ⪰ 0,  A_{γ_3 γ_3} ⪰ 0

(Grone, Johnson, Sá, Wolkowicz 1984)

Sparse semidefinite optimization

    minimize    tr(C X)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ⪰ 0

- aggregate sparsity pattern E is the union of the patterns of C, A_1, ..., A_m
- without loss of generality, can assume E is chordal
- the constraint X ⪰ 0 can be replaced by X_{γ_i γ_i} ⪰ 0 for all cliques γ_i

Decomposition algorithms

- replace X with smaller matrix variables X_i = X_{γ_i γ_i}
- add equality constraints to enforce consistency of the variables X_i
- first proposed for interior-point methods (Fukuda et al. 2000, Nakata et al. 2003)
- first-order (dual decomposition) methods (Lu et al. 2007, Lam et al. 2011, ...)

Example: sparse nearest matrix problems

find the nearest sparse PSD-completable matrix with a given sparsity pattern:

    minimize    ||X − A||_F^2
    subject to  X ∈ Π_E(S^n_+)

find the nearest sparse PSD matrix with a given sparsity pattern:

    minimize    ||S + A||_F^2
    subject to  S ∈ S^n_+ ∩ S^n_E

these two problems are duals, since K = Π_E(S^n_+) and K* = S^n_+ ∩ S^n_E

Decomposition methods

from the decomposition theorems, the two problems can be written as

    primal:  minimize ||X − A||_F^2                            subject to X_{γ_i γ_i} ⪰ 0, i = 1, ..., l
    dual:    minimize ||A + Σ_{i=1}^{l} P_{γ_i}^T H_i P_{γ_i}||_F^2   subject to H_i ⪰ 0, i = 1, ..., l

Algorithms

- Dykstra's algorithm (dual block coordinate ascent)
- (fast) dual projected gradient algorithm (FISTA)
- Douglas-Rachford splitting, ADMM

each iteration reduces to a sequence of projections on PSD cones of order |γ_i| (via eigenvalue decomposition)
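The building block shared by all three methods is the Frobenius-norm projection of a small symmetric block onto the PSD cone, obtained by clipping negative eigenvalues to zero (the 2 × 2 example matrix is chosen arbitrarily):

```python
import numpy as np

def project_psd(A):
    # Euclidean projection onto the PSD cone: keep the eigenvector basis,
    # replace negative eigenvalues by zero
    w, V = np.linalg.eigh(A)
    return (V * np.clip(w, 0.0, None)) @ V.T

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])          # eigenvalues 3 and -1
X = project_psd(A)
assert np.linalg.eigvalsh(X).min() >= -1e-12
# dropping the eigenvalue -1 leaves 3 * (1/2)[[1,1],[1,1]]:
assert np.allclose(X, [[1.5, 1.5], [1.5, 1.5]])
```

In the decomposition methods this projection is applied per clique, so the eigenvalue decompositions are of order |γ_i| rather than n.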

Results

sparsity patterns from the University of Florida Sparse Matrix Collection:

    n        density   #cliques   avg. clique size   max. clique
    20141    2.80e-3     1098          35.7              168
    38434    1.25e-3     2365          28.1              188
    57975    9.04e-4     8875          14.9              132
    79841    9.71e-4     4247          44.4              337
    114599   2.02e-4     7035          18.9               58

             total runtime (sec)          time/iteration (sec)
    n        FISTA    Dykstra   DR        FISTA   Dykstra   DR
    20141    2.5e2    3.9e1     3.8e1     1.0     1.6       1.5
    38434    4.7e2    4.7e1     6.2e1     2.1     1.9       2.5
    57975    > 4hr    1.4e2     1.1e3     3.5     5.7       6.4
    79841    2.4e3    3.0e2     2.4e2     6.3     7.6       9.7
    114599   5.3e2    5.5e1     1.0e2     2.6     2.2       4.0

(Sun and Vandenberghe 2015)

Outline

1. Chordal graphs
2. Decomposition of sparse matrix cones
3. Multifrontal algorithms for barrier functions
4. Minimum rank positive semidefinite completion

Sparse SDP as nonsymmetric conic linear program

    minimize    tr(C X)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ⪰ 0

aggregate sparsity pattern E is the union of the patterns of C, A_1, ..., A_m

Equivalent conic LP

    minimize    tr(C X)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ∈ K

- K = Π_E(S^n_+) is the cone of PSD-completable matrices with pattern E
- an optimization problem in a lower-dimensional space
- K is not self-dual, so no symmetric interior-point methods

Barrier function for the positive semidefinite cone

    φ(S) = −log det S,    dom φ = {S ∈ S^n_E | S ≻ 0}

- gradient (negative projected inverse): ∇φ(S) = −Π_E(S^{−1}); requires the entries of the dense inverse S^{−1} on the diagonal and for {i, j} ∈ E
- Hessian applied to a sparse Y ∈ S^n_E:

      ∇²φ(S)[Y] = (d/dt) ∇φ(S + tY) |_{t=0} = Π_E(S^{−1} Y S^{−1})

  requires a projection of the dense product S^{−1} Y S^{−1}
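A numerical sanity check of the gradient formula, with an arbitrary 4 × 4 tridiagonal pattern: for a sparse direction Y ∈ S^n_E, the directional derivative of φ(S) = −log det S equals tr(∇φ(S) Y) with ∇φ(S) = −Π_E(S^{−1}), so only the projected inverse is ever needed.

```python
import numpy as np

def proj(M, edges, n):
    # projection Π_E: keep the diagonal and the entries on the pattern E
    P = np.zeros((n, n))
    for i in range(n):
        P[i, i] = M[i, i]
    for i, j in edges:
        P[i, j] = M[i, j]; P[j, i] = M[j, i]
    return P

edges, n = [(0, 1), (1, 2), (2, 3)], 4       # tridiagonal pattern E
S = 2.0 * np.eye(n)
for i, j in edges:
    S[i, j] = S[j, i] = 0.5                  # positive definite, pattern E

Y = np.zeros((n, n))
Y[0, 1] = Y[1, 0] = 1.0
Y[2, 2] = -0.5                               # sparse direction Y in S^n_E

phi = lambda M: -np.log(np.linalg.det(M))
G = -proj(np.linalg.inv(S), edges, n)        # gradient: negative projected inverse
t = 1e-6
fd = (phi(S + t*Y) - phi(S - t*Y)) / (2*t)   # central finite difference
assert abs(fd - np.trace(G @ Y)) < 1e-6
```

The multifrontal algorithms on the next slide compute this projected inverse directly from the Cholesky factor, without ever forming the dense S^{−1}.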

Multifrontal algorithms

assume E is a chordal sparsity pattern (or a chordal extension)

Cholesky factorization (Duff and Reid 1983)
- the factorization S = L D L^T gives the value of the barrier: φ(S) = −Σ_i log D_ii
- recursion on the elimination tree in topological order

Gradient (Campbell and Davis 1995, Andersen et al. 2013)
- compute ∇φ(S) = −Π_E(S^{−1}) from the equation S^{−1} L = L^{−T} D^{−1}
- recursion on the elimination tree in inverse topological order

Hessian
- compute ∇²φ(S)[Y] = Π_E(S^{−1} Y S^{−1}) by linearizing the recursion for the gradient
- two recursions on the elimination tree (topological and inverse topological order)

Projected inverse versus Cholesky factorization

(figures: problem statistics, number of nonzeros in A against order n, and a log-log comparison of the time for the projected inverse against the time for the Cholesky factorization)

- 667 patterns from the University of Florida Sparse Matrix Collection
- time in seconds for projected inverse and Cholesky factorization
- code at github.com/cvxopt/chompack

Barrier for the positive semidefinite completable cone

    φ*(X) = sup_S (−tr(X S) − φ(S)),    dom φ* = {X = Π_E(Y) | Y ≻ 0}

- this is the conjugate of the barrier φ(S) = −log det S for the sparse PSD cone
- the inverse Z = Ŝ^{−1} of the optimal Ŝ is the maximum determinant PD completion of X:

      maximize    log det Z
      subject to  Π_E(Z) = X

- the gradient and Hessian of φ* at X are ∇φ*(X) = −Ŝ and ∇²φ*(X) = ∇²φ(Ŝ)^{−1}
- for chordal E, there are efficient multifrontal algorithms for the Cholesky factors of Ŝ, given X
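A minimal illustration on the 3 × 3 tridiagonal pattern with unknown (1,3) entry (data chosen arbitrarily): the maximum-determinant completion is characterized by its inverse being zero in the free position, which here gives the closed form X13 = X12 X23 / X22.

```python
import numpy as np

X = np.array([[2.0,    0.8,    np.nan],
              [0.8,    1.5,    0.6   ],
              [np.nan, 0.6,    1.2   ]])     # np.nan marks the free entry

z = X[0, 1] * X[1, 2] / X[1, 1]              # closed-form max-det completion
Z = X.copy(); Z[0, 2] = Z[2, 0] = z

# the inverse of the max-det completion vanishes at the free position
assert np.isclose(np.linalg.inv(Z)[0, 2], 0.0)
# and the determinant decreases for any other value of the free entry
for eps in (-0.1, 0.1):
    W = Z.copy(); W[0, 2] = W[2, 0] = z + eps
    assert np.linalg.det(W) < np.linalg.det(Z)
```

The zero pattern of Z^{−1} matches the statement Π_E(Ŝ^{−1}) = X with Ŝ = Z^{−1} ∈ S^n_E; the multifrontal algorithms compute the Cholesky factors of Ŝ directly for general chordal patterns.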

Inverse completion versus Cholesky factorization

(figure: log-log comparison of the time for the inverse completion factorization against the time for the Cholesky factorization)

time for the Cholesky factorization of the inverse of the maximum determinant PD completion

Nonsymmetric interior-point methods

    minimize    tr(C X)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ∈ Π_E(S^n_+)

can be solved by nonsymmetric primal or dual barrier methods

- logarithmic barriers for the cone Π_E(S^n_+) and its dual cone S^n_+ ∩ S^n_E:

      φ*(X) = sup_S (−tr(X S) + log det S),    φ(S) = −log det S

- fast evaluation of barrier values and derivatives if the pattern is chordal
- examples: linear complexity per iteration for band or arrow patterns
- code and numerical results at github.com/cvxopt/smcp

(Fukuda et al. 2000, Burer 2003, Srijungtongsiri and Vavasis 2004, Andersen et al. 2010)

Sparsity patterns

- sparsity patterns from the University of Florida Sparse Matrix Collection
- m = 200 constraints; random data with 0.05% nonzeros in the A_i relative to E

(figures: eight test patterns)

    rs228:  n = 1,919    rs35:  n = 2,003    rs200:  n = 3,025    rs365:  n = 4,704
    rs1555: n = 7,479    rs828: n = 10,800   rs1184: n = 14,822   rs1288: n = 30,401

Results

average time per iteration (seconds) for different solvers:

    n      DSDP    SDPA    SDPA-C   SDPT3   SeDuMi   SMCP
    1919    1.4    30.7      5.7     10.7    511.2     2.3
    2003    4.0    34.4     41.5     13.0    521.1    15.3
    3025    2.9   128.3      6.0     33.0   1856.9     2.2
    4704   15.2   407.0     58.8     99.6   4347.0    18.6

    n       DSDP    SDPA-C    SMCP
    7479    22.1      23.1     9.5
    10800  482.1    1812.8   311.2
    14822  791.0    2925.4   463.8
    30401    mem    2070.2   320.4

SMCP uses the nonsymmetric matrix cone approach (Andersen et al. 2010)

Outline

1. Chordal graphs
2. Decomposition of sparse matrix cones
3. Multifrontal algorithms for logarithmic barriers
4. Minimum rank positive semidefinite completion

Minimum rank PSD completion with chordal sparsity

recall that A ∈ S^n_E has a positive semidefinite completion if and only if A_{γ_i γ_i} ⪰ 0 for all cliques γ_i

the minimum rank PSD completion has rank equal to

    max_{cliques γ_i} rank(A_{γ_i γ_i})

(Dancis 1992)

Two-block completion problem

we first consider the simple two-block completion problem

    A = [ A11  A12   ?  ]
        [ A21  A22  A23 ]
        [  ?   A32  A33 ]

- a completion exists if and only if

      C1 = [ A11  A12 ]  ⪰ 0,    C2 = [ A22  A23 ]  ⪰ 0
           [ A21  A22 ]               [ A32  A33 ]

- we construct a PSD completion of rank r = max{rank(C1), rank(C2)}

Two-block completion algorithm

- compute matrices U, V, Ṽ, W of column dimension r such that

      C1 = [ U ] [ U ]^T        C2 = [ Ṽ ] [ Ṽ ]^T
           [ V ] [ V ]   ,           [ W ] [ W ]

- since V V^T = Ṽ Ṽ^T, the matrices V and Ṽ have SVDs V = P Σ Q_1^T and Ṽ = P Σ Q_2^T; hence V = Ṽ Q with Q = Q_2 Q_1^T an orthogonal r × r matrix
- a completion of rank r is given by

      [ U Q^T ] [ U Q^T ]^T   [ A11       A12   U Q^T W^T ]
      [   Ṽ   ] [   Ṽ   ]   = [ A21       A22      A23    ]
      [   W   ] [   W   ]     [ W Q U^T   A32      A33    ]
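The construction can be sketched in NumPy (the helper names `psd_factor` and `two_block_completion` are illustrative; the orthogonal Q is recovered by orthogonal Procrustes on Ṽ^T V, which is exact here because V V^T = Ṽ Ṽ^T):

```python
import numpy as np

def psd_factor(C, r):
    # factor a PSD matrix as C = F @ F.T with r columns (top-r eigenpairs)
    w, Z = np.linalg.eigh(C)
    w = np.clip(w[-r:], 0.0, None)
    return Z[:, -r:] * np.sqrt(w)

def two_block_completion(A11, A12, A22, A23, A33):
    n1, n2 = A11.shape[0], A22.shape[0]
    C1 = np.block([[A11, A12], [A12.T, A22]])
    C2 = np.block([[A22, A23], [A23.T, A33]])
    r = max(np.linalg.matrix_rank(C1), np.linalg.matrix_rank(C2))
    U, V = np.vsplit(psd_factor(C1, r), [n1])
    Vt, W = np.vsplit(psd_factor(C2, r), [n2])
    # orthogonal Q with V = Vt @ Q, via orthogonal Procrustes
    P, _, Rt = np.linalg.svd(Vt.T @ V)
    Q = P @ Rt
    A13 = U @ Q.T @ W.T
    return np.block([[A11,   A12,   A13],
                     [A12.T, A22,   A23],
                     [A13.T, A23.T, A33]])

# generate a random rank-3 PSD matrix and hide its (1,3) block
rng = np.random.default_rng(0)
Y = rng.standard_normal((12, 3))
A = Y @ Y.T
X = two_block_completion(A[:4, :4], A[:4, 4:8], A[4:8, 4:8], A[4:8, 8:], A[8:, 8:])

assert np.linalg.eigvalsh(X).min() > -1e-8   # completion is PSD
assert np.linalg.matrix_rank(X) == 3         # minimum possible rank
```

For this data the middle block A22 has full rank r, so Q, and hence the rank-r completion, is unique and the algorithm recovers the hidden block of A exactly.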

Minimum rank completion for general chordal pattern

recursion over the supernodal elimination tree in inverse topological order

(figure: the 17-vertex supernodal elimination tree shown earlier)

- the vertices of the tree are the cliques γ_i
- the top row of vertex i is the intersection α_i of clique γ_i with its parent clique
- the bottom rows are the supernodes ν_i = γ_i \ α_i; these form a partition of {1, 2, ..., n}

Minimum rank completion for general chordal pattern

we compute an n × r matrix Y with Π_E(Y Y^T) = A and

    r = max_{cliques γ_i} rank(A_{γ_i γ_i})

the block rows Y_{ν_j} are computed in inverse topological order; hence Y_{α_j} is known when we compute Y_{ν_j}

Algorithm: enumerate the cliques in inverse topological order; at vertex γ_j, compute matrices U_j, V_j with column dimension r such that

    [ A_{ν_j ν_j}  A_{ν_j α_j} ]   [ U_j ] [ U_j ]^T
    [ A_{α_j ν_j}  A_{α_j α_j} ] = [ V_j ] [ V_j ]

- if j is the root of the supernodal elimination tree, set Y_{ν_j} = U_j
- otherwise, compute an orthogonal Q such that V_j = Y_{α_j} Q and set Y_{ν_j} = U_j Q^T

Sparse semidefinite optimization

    minimize    tr(C X)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ⪰ 0

- aggregate sparsity pattern E is the union of the patterns of C, A_1, ..., A_m
- any feasible X can be replaced by a PSD completion of Π_E(X)
- if E is chordal, there is a completion with rank bounded by the size of the largest clique
- proves exactness of some simple SDP relaxations
- useful for rounding solutions of SDP relaxations to minimum rank solutions

Summary

Sparse matrix theory: PSD and PSD-completable matrices with chordal pattern

- decomposition of matrix cones as a sum or intersection of simple cones
- fast algorithms for evaluating barrier functions and derivatives
- simple algorithms for maximum determinant and minimum rank completion
- simple algorithms for Euclidean distance matrix completion

Sparse semidefinite optimization

    minimize    tr(C X)
    subject to  tr(A_i X) = b_i,  i = 1, ..., m
                X ⪰ 0

- decomposition and splitting methods
- nonsymmetric interior-point methods