Matrix Rank Minimization with Applications Maryam Fazel Haitham Hindi Stephen Boyd Information Systems Lab Electrical Engineering Department Stanford University 8/2001 ACC 01
Outline Rank Minimization Problem (RMP) Examples Solution methods ACC 01 1
Rank Minimization Problem (RMP) minimize Rank X subject to X C, X R m n is the optimization variable; C is convex set RMP is difficult nonconvex problem (NP-hard) RMP arises in many application areas usual meaning: find simplest or minimum order system; model with fewest parameters... (Occam s razor) this talk: new methods to (approximately) solve RMP ACC 01 2
Maximum sparsity problem important special case of RMP: variable X = diag(x) then Rank X = Card(x), number of nonzero xi RMP reduces to finding the sparsest vector in convex set C: minimize Card(x) subject to x C meaning: find simplest model, design with fewest components, sparse signal representation; extract sparse signals from noise,... ACC 01 3
Constrained factor analysis find minimum rank covariance matrix close to measured ˆΣ, consistent with prior info minimize Rank Σ subject to Σ ˆΣ ɛ Σ = Σ T 0 Σ C Σ R n n is variable, C is prior information, ɛ is tolerance Rank Σ is number of factors that explain Σ without last constraint, can solve by eigenvalue decomposition; but general case is difficult ACC 01 4
Minimum order controller design find minimum order controller that achieves specified closed-loop performance [ minimize Rank subject to [ X I I Y X I I Y ] 0 ] other LMIs in X, Y with variables X = X T, Y = Y T R n n, possibly others ACC 01 5
Minimum order system realization find minimum order system that satisfies time-domain specs (rise-time, slew-rate, overshoot, settling-time,... ) h1 h2 hn minimize Rank h2 h3 hn+1... hn hn+1 h2n 1 subject to F (h1,..., hn) g variables are impulse response h1,..., h2n 1; linear inequality constraints involve only h1,..., hn ACC 01 6
Euclidean distance matrix (EDM) D R n n is EDM if there are x1,..., xn R r s.t. Dij = xi xj 2 r is called embedding dimension [Schoenberg 35] D = D T R n n is EDM with embedding dimension r iff Dii = 0, VDV 0, where V = I 1 n 11T, Rank VDV r ACC 01 7
Minimum embedding dimension find EDM D with minimum embedding dimension satisfying distance bounds minimize Rank VDV subject to Dii = 0 i = 1,..., n VDV 0 Lij Dij Uij i, j = 1,..., n. applications: statistics/psychometrics (called multidimensional scaling) chemistry (molecular conformation) ACC 01 8
Exact solution methods special cases with analytical solutions via SVD, EV, e.g., minimize Rank X subject to X A b solution: sum of r leading dyads in SVD of A, where σr > b, σr+1 b special cases that reduce to convex problems (e.g., [Mesbahi 97]) global optimization (e.g., branch and bound) ACC 01 9
Heuristic solution methods Factorization methods (e.g., [Iwasaki 99]) Rank X r iff there are F R m r, G R r n s.t. X = F G solve feasibility problem with variables F, G; use bisection on r Analytic anti-centering [David 94] use Newton method in reverse to maximize log barrier or potential function ACC 01 10
Alternating projections [Grigoriadis & Beran 00] alternate between projecting onto constraint set C via convex optimization projecting onto set of matrices of rank r via SVD x0 Rank X r PSfrag replacements C ACC 01 11
Trace heuristic for PSD matrices for X = X T 0, minimizing trace tends to give low rank solution [Mesbahi 97, Pare 00] RMP Trace heuristic minimize Rank X subject to X C minimize Tr X subject to X C convex problem, hence efficiently solved provides lower bound on min rank objective ACC 01 12
variation: weighted trace minimization minimize Tr W X subject to X C where W = W T 0 ACC 01 13
Log-det heuristic for PSD matrices for X = X T 0, (locally) minimize log det(x + δi) (δ > 0 is small constant for regularization) RMP Log-det heuristic minimize Rank X subject to X C minimize log det(x + δi) subject to X C objective is nonconvex (in fact, concave) can use any method to find a local minimum ACC 01 14
Idea behind log-det heuristic n n Rank X = 1(λi > 0) log det(x + δi) = log(λi + δ) i=1 i=1 PSfrag replacements 1 log(λi + δ) log δ λi ACC 01 15
Iterative linearization method linearize (concave) objective at Xk 0: log det(x + δi) log det(xk + δi) + Tr(Xk + δi) 1 (X Xk) minimize linearized objective (a convex problem): Xk+1 = argmin Tr(Xk + δi) 1 X X C i.e., iterative weighted trace minimization with X0 = I, first iteration same as trace heuristic only a few iterations needed (about 5 or 6) ACC 01 16
Semidefinite embedding question: can we extend trace & log-det heuristics to general (nonsquare, non PSD) matrices? let X R m n then Rank X r iff there are Y = Y T R m m, Z = Z T R n n, s.t. [ Rank Y 0 0 Z ] 2r, [ Y X X T Z ] 0 thus, can embed general (non PSD) RMP in a (larger) PSD RMP ACC 01 17
RMP in embedded PSD form minimize Rank X subject to X C equivalent to PSD RMP [ minimize Rank [ subject to Y X X T Z X C, Y 0 0 Z ] ] 0 with variables X R m n, Y = Y T R m m, Z = Z T R n n can now apply any method for symmetric PSD RMP ACC 01 18
Trace heuristic for general matrices [ minimize Tr [ subject to Y 0 0 Z Y X X T Z X C ] ] 0 can show this is equivalent to minimize X subject to X C, where X = n i=1 σ i(x), called nuclear norm of X, is dual of spectral (maximum singular value) norm ACC 01 19
Convex envelope convex envelope of f : C R is largest convex function g s.t. g(x) f(x) for all x C PSfrag replacements f(x) g(x) best convex lower approximation epigraph of g is convex hull of epigraph of f ACC 01 20
Convex envelope of rank fact: X is cvx envelope of Rank X on {X R m n X 1} conclusions: trace heuristic minimizes convex envelope of rank (i.e., the best convex approximation to rank) over ball hence, heuristic provides lower bound on objective provides theoretical support for use of trace/nuclear norm heuristic ACC 01 21
Maximum sparsity problem nuclear norm heuristic for diagonal X becomes minimize x 1 subject to x C well-known l1 heuristic for finding sparse solutions used in LASSO methods in statistics [Tibshirani 94], signal decomposition by basis pursuit [Donoho 96],... x 1 is convex envelope of Card(x) on {x x 1} trace/nuclear norm heuristic is extension of l1 heuristic to matrix case ACC 01 22
Log-det heuristic for general matrices ([ Y 0 minimize log det 0 Z [ ] Y X subject to X T 0 Z X C ] + δi ) can show this is equivalent to n minimize i=1 log(σ i(x) + δ) subject to X C can linearize as before to obtain iterations in X, Y, Z ACC 01 23
Iterative l1 heuristic for maximum sparsity problem log-det heuristic for maximum sparsity problem yields minimize i log( x i + δ) subject to x C. iterative linearization/minimization yields x (k+1) = argmin x C n w (k) i xi, w (k) 1 i = x (k) i + δ i=1 each step is weighted l1 norm minimization when x (k) i small, weight in next step is large; hence, small entries in x are pushed towards zero (subject to x C) ACC 01 24
Example: minimum order system realization with step response constraints find minimum order system that satisfies li si ui, i = 1,..., 16 n = 16 trace/nuclear norm heuristic yields rank 5 log-det heuristic converges in 5 steps, yields rank 4 step response (solid) and specs (dashed) 1.2 1 0.8 0.6 PSfrag replacements 0.4 0.2 0 0.2 0 5 10 15 20 25 30 35 t ACC 01 25
ACC 01 26 iterations 10 1 1.5 2 2.5 3 3.5 4 4.5 5 PSfrag replacements log of non-zero σ i s 2 3 4 5 6 7 8 9 1 0
Conclusions RMP is difficult nonconvex problem, with many applications maximum sparsity problem is special case trace and log-det heuristics for PSD RMP can be extended to general case, via semidefinite embedding generalization of trace heuristic to general matrices is nuclear norm, which is convex envelope of rank ACC 01 27