Determinant maximization with linear matrix inequality constraints
S. Boyd, L. Vandenberghe, S.-P. Wu
Information Systems Laboratory, Stanford University
SCCM Seminar, 5 February 1996

MAXDET problem definition
$$\begin{array}{ll} \text{minimize} & c^T x + \log\det G(x)^{-1} \\ \text{subject to} & G(x) = G_0 + x_1 G_1 + \cdots + x_m G_m > 0 \\ & F(x) = F_0 + x_1 F_1 + \cdots + x_m F_m \geq 0 \end{array}$$
- $x \in \mathbf{R}^m$ is the variable
- $G_i = G_i^T \in \mathbf{R}^{l \times l}$, $F_i = F_i^T \in \mathbf{R}^{n \times n}$
- $F(x) \geq 0$, $G(x) > 0$ are called linear matrix inequalities (LMIs)
- looks specialized, but includes a wide variety of convex optimization problems
- convex problem
- tractable, in theory and practice
- useful duality theory
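As a minimal sketch of the definition above, the following evaluates the MAXDET objective at a given point and returns infinity when either matrix inequality fails. The helper names `affine` and `maxdet_objective` and the small test data are illustrative assumptions, not part of the original slides.

```python
import numpy as np

def affine(M0, Ms, x):
    """Return M0 + x_1 M_1 + ... + x_m M_m."""
    return M0 + sum(xi * Mi for xi, Mi in zip(x, Ms))

def maxdet_objective(c, G0, Gs, F0, Fs, x):
    """Return c^T x + log det G(x)^{-1}, or +inf if x is infeasible."""
    G = affine(G0, Gs, x)
    F = affine(F0, Fs, x)
    # G(x) > 0 and F(x) >= 0 are checked through the smallest eigenvalue.
    if np.linalg.eigvalsh(G).min() <= 0 or np.linalg.eigvalsh(F).min() < 0:
        return np.inf
    _, logdet = np.linalg.slogdet(G)
    return float(c @ x) - logdet

# Hypothetical data with m = 1: G(x) = diag(1 + x, 1), F(x) = diag(1 - x, 1 + x).
c = np.array([0.5])
G0, Gs = np.eye(2), [np.diag([1.0, 0.0])]
F0, Fs = np.eye(2), [np.diag([-1.0, 1.0])]
value = maxdet_objective(c, G0, Gs, F0, Fs, np.array([0.5]))
```

Using `slogdet` rather than `log(det(...))` avoids overflow or underflow of the determinant for larger matrices.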

Outline
1. examples of MAXDET problems
2. duality theory
3. interior-point methods

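Special cases of MAXDET
semidefinite program (SDP)
$$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & F(x) = F_0 + x_1 F_1 + \cdots + x_m F_m \geq 0 \end{array}$$
[Figure: feasible set $F(x) \geq 0$ and its complement $F(x) \not\geq 0$, cost direction $-c$, optimal point $x_{\mathrm{opt}}$]
LMI can represent many convex constraints: linear inequalities, convex quadratic inequalities, matrix norm constraints, ...
linear program
$$\begin{array}{ll} \text{minimize} & c^T x \\ \text{subject to} & a_i^T x \leq b_i, \quad i = 1, \ldots, n \end{array}$$
SDP with $F(x) = \mathbf{diag}(b - Ax)$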
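The LP-as-SDP embedding can be checked numerically. This is a small sketch, assuming a dense NumPy representation; `lp_as_lmi` and `lmi_holds` are hypothetical helper names. It builds $F(x) = \mathbf{diag}(b - Ax)$ and compares the LMI test with the componentwise test $Ax \leq b$.

```python
import numpy as np

def lp_as_lmi(A, b):
    """Return (F0, [F1, ..., Fm]) such that F(x) = diag(b - A x)."""
    F0 = np.diag(b)
    Fs = [np.diag(-A[:, j]) for j in range(A.shape[1])]
    return F0, Fs

def lmi_holds(F0, Fs, x, tol=1e-12):
    """Test F(x) >= 0 through the smallest eigenvalue."""
    F = F0 + sum(xj * Fj for xj, Fj in zip(x, Fs))
    return bool(np.linalg.eigvalsh(F).min() >= -tol)

# Hypothetical polyhedron: x1 <= 1, x2 <= 1, -x1 - x2 <= 0.
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([1.0, 1.0, 0.0])
F0, Fs = lp_as_lmi(A, b)
inside = lmi_holds(F0, Fs, np.array([0.5, 0.5]))
outside = lmi_holds(F0, Fs, np.array([2.0, 0.0]))
```

Because $F(x)$ is diagonal, its eigenvalues are exactly the slacks $b_i - a_i^T x$, which is why the two tests agree.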

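analytic center of LMI
$$\begin{array}{ll} \text{minimize} & \log\det F(x)^{-1} \\ \text{subject to} & F(x) = F_0 + x_1 F_1 + \cdots + x_m F_m > 0 \end{array}$$
- $\log\det F(x)^{-1}$ is smooth and convex on $\{x \mid F(x) > 0\}$
- optimal point $x_{\mathrm{ac}}$ maximizes $\det F(x)$
- $x_{\mathrm{ac}}$ is called the analytic center of the LMI $F(x) > 0$
[Figure: analytic center $x_{\mathrm{ac}}$ inside the feasible set]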
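The analytic center is the point where the gradient of $\log\det F(x)^{-1}$ vanishes. The sketch below uses the standard identity $\partial \log\det F(x)^{-1} / \partial x_i = -\mathbf{Tr}(F(x)^{-1} F_i)$; the function names and the one-variable example $F(x) = \mathbf{diag}(1 + x, 1 - x)$ are assumptions chosen for illustration.

```python
import numpy as np

def F_of(F0, Fs, x):
    """Return F(x) = F0 + x_1 F_1 + ... + x_m F_m."""
    return F0 + sum(xi * Fi for xi, Fi in zip(x, Fs))

def barrier(F0, Fs, x):
    """log det F(x)^{-1}, assuming F(x) > 0."""
    _, logdet = np.linalg.slogdet(F_of(F0, Fs, x))
    return -logdet

def barrier_grad(F0, Fs, x):
    """Gradient with entries -Tr(F(x)^{-1} F_i)."""
    Finv = np.linalg.inv(F_of(F0, Fs, x))
    return np.array([-np.trace(Finv @ Fi) for Fi in Fs])

# Hypothetical LMI describing the interval -1 < x < 1.
F0, Fs = np.eye(2), [np.diag([1.0, -1.0])]
grad_at_center = barrier_grad(F0, Fs, np.array([0.0]))
grad_off_center = barrier_grad(F0, Fs, np.array([0.3]))
```

For this example the gradient is zero at $x = 0$, the midpoint of the interval, which is its analytic center.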

Minimum volume ellipsoid around points
find min vol ellipsoid containing points $x_1, \ldots, x_K \in \mathbf{R}^n$
ellipsoid $\mathcal{E} = \{x \mid \|Ax - b\| \leq 1\}$
- center $A^{-1} b$
- $A = A^T > 0$, volume proportional to $\det A^{-1}$
$$\begin{array}{ll} \text{minimize} & \log\det A^{-1} \\ \text{subject to} & A = A^T > 0 \\ & \|A x_i - b\| \leq 1, \quad i = 1, \ldots, K \end{array}$$
convex optimization problem in $A$, $b$ ($n + n(n+1)/2$ vars)
express constraints as LMI
$$\|A x_i - b\| \leq 1 \iff \begin{bmatrix} I & A x_i - b \\ (A x_i - b)^T & 1 \end{bmatrix} \geq 0$$
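The equivalence between the norm constraint and the block LMI follows from a Schur complement argument. A minimal numerical check, with hypothetical helper names and data, is sketched here: the smallest eigenvalue of the block matrix equals $1 - \|Ax - b\|$, so it is nonnegative exactly when the point lies in the ellipsoid.

```python
import numpy as np

def in_ellipsoid(A, b, x):
    """Direct test ||A x - b|| <= 1."""
    return bool(np.linalg.norm(A @ x - b) <= 1.0)

def lmi_block(A, b, x):
    """Build the block matrix [[I, A x - b], [(A x - b)^T, 1]]."""
    u = (A @ x - b).reshape(-1, 1)
    n = u.shape[0]
    return np.block([[np.eye(n), u], [u.T, np.ones((1, 1))]])

def in_ellipsoid_lmi(A, b, x, tol=1e-12):
    """Same membership test, expressed as an LMI."""
    return bool(np.linalg.eigvalsh(lmi_block(A, b, x)).min() >= -tol)

# Hypothetical ellipsoid and two test points.
A = np.diag([2.0, 1.0])
b = np.zeros(2)
x_in = np.array([0.3, 0.4])
x_out = np.array([0.6, 0.5])
```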

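Maximum volume ellipsoid in polytope
find max vol ellipsoid in $\mathcal{P} = \{x \mid a_i^T x \leq b_i,\ i = 1, \ldots, L\}$
[Figure: ellipsoid $\mathcal{E}$ with center $d$ inside polytope $\mathcal{P}$]
ellipsoid $\mathcal{E} = \{By + d \mid \|y\| \leq 1\}$
- center $d$
- $B = B^T > 0$, volume proportional to $\det B$
$$\begin{aligned} \mathcal{E} \subseteq \mathcal{P} &\iff a_i^T (By + d) \leq b_i \ \text{ for all } \|y\| \leq 1 \\ &\iff \sup_{\|y\| \leq 1} a_i^T B y + a_i^T d \leq b_i \\ &\iff \|B a_i\| + a_i^T d \leq b_i, \quad i = 1, \ldots, L \end{aligned}$$
convex constraint in $B$ and $d$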
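The key step in the chain above is $\sup_{\|y\| \leq 1} a^T B y = \|Ba\|$ for symmetric $B$, attained at $y = Ba / \|Ba\|$. The following sketch, with hypothetical names and data, evaluates that support value and uses it for a single halfspace containment test.

```python
import numpy as np

def support_value(B, a):
    """sup of a^T B y over ||y|| <= 1 for symmetric B; equals ||B a||."""
    return float(np.linalg.norm(B @ a))

def ellipsoid_in_halfspace(B, d, a, b):
    """Test whether {B y + d : ||y|| <= 1} lies in {x : a^T x <= b}."""
    return bool(support_value(B, a) + a @ d <= b)

# Hypothetical symmetric positive definite B and direction a.
B = np.array([[2.0, 0.5], [0.5, 1.0]])
a = np.array([1.0, -1.0])
d = np.zeros(2)
y_star = B @ a / np.linalg.norm(B @ a)  # maximizer on the unit sphere
```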

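maximum volume $\mathcal{E} \subseteq \mathcal{P}$
formulation as convex problem in variables $B$, $d$:
$$\begin{array}{ll} \text{maximize} & \log\det B \\ \text{subject to} & B = B^T > 0 \\ & \|B a_i\| + a_i^T d \leq b_i, \quad i = 1, \ldots, L \end{array}$$
express constraints as LMI in $B$, $d$
$$\|B a_i\| + a_i^T d \leq b_i \iff \begin{bmatrix} (b_i - a_i^T d) I & B a_i \\ (B a_i)^T & b_i - a_i^T d \end{bmatrix} \geq 0$$
hence, formulation as MAXDET problem
$$\begin{array}{ll} \text{minimize} & \log\det B^{-1} \\ \text{subject to} & B > 0 \\ & \begin{bmatrix} (b_i - a_i^T d) I & B a_i \\ (B a_i)^T & b_i - a_i^T d \end{bmatrix} \geq 0, \quad i = 1, \ldots, L \end{array}$$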

Experiment design
estimate $x$ from measurements $y_k = a_k^T x + w_k$, $k = 1, \ldots, N$
- $a_k \in \{v_1, \ldots, v_m\}$, $v_i$ given test vectors
- $w_k$ IID $\mathcal{N}(0, 1)$ measurement noise
- $\lambda_i$ = fraction of $a_k$'s equal to $v_i$
- $N \gg m$
LS estimator:
$$\hat{x} = \left( \sum_{k=1}^N a_k a_k^T \right)^{-1} \sum_{k=1}^N y_k a_k$$
error covariance:
$$\mathbf{E}\,(\hat{x} - x)(\hat{x} - x)^T = \frac{1}{N} \left( \sum_{i=1}^m \lambda_i v_i v_i^T \right)^{-1} = \frac{1}{N} E(\lambda)$$
optimal experiment design: choose $\lambda_i$, with $\lambda_i \geq 0$ and $\sum_{i=1}^m \lambda_i = 1$, that make $E(\lambda)$ 'small'
- minimize $\lambda_{\max}(E(\lambda))$ (E-optimality)
- minimize $\mathbf{Tr}\, E(\lambda)$ (A-optimality)
- minimize $\det E(\lambda)$ (D-optimality)
all are MAXDET problems
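To make the three criteria concrete, the sketch below computes $E(\lambda) = (\sum_i \lambda_i v_i v_i^T)^{-1}$ for a given design and evaluates each scalar measure. The function names and the three test vectors are illustrative assumptions.

```python
import numpy as np

def error_covariance(lam, V):
    """E(lambda) = (sum_i lambda_i v_i v_i^T)^{-1}; the rows of V are the v_i."""
    M = (V.T * lam) @ V  # equals sum_i lambda_i v_i v_i^T
    return np.linalg.inv(M)

def design_criteria(lam, V):
    """Return the E-, A- and D-optimality measures of a design lambda."""
    E = error_covariance(lam, V)
    return {
        "E": float(np.linalg.eigvalsh(E).max()),
        "A": float(np.trace(E)),
        "D": float(np.linalg.det(E)),
    }

# Hypothetical test vectors in R^2 and two candidate designs.
V = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
two_point = design_criteria(np.array([0.5, 0.5, 0.0]), V)
uniform = design_criteria(np.ones(3) / 3, V)
```

In this small example the uniform design has the smaller determinant, so it is preferred under the D-criterion.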

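D-optimal design
$$\begin{array}{ll} \text{minimize} & \log\det \left( \sum_{i=1}^m \lambda_i v_i v_i^T \right)^{-1} \\ \text{subject to} & \sum_{i=1}^m \lambda_i v_i v_i^T > 0 \\ & \lambda_i \geq 0, \quad i = 1, \ldots, m \\ & \sum_{i=1}^m \lambda_i = 1 \end{array}$$
can add other convex constraints, e.g.,
- bounds on cost or time of measurements: $c_i^T \lambda \leq b_i$
- no more than 80% of the measurements are concentrated in less than 20% of the test vectors:
$$\sum_{i=1}^{\lfloor m/5 \rfloor} \lambda_{[i]} \leq 0.8$$
  ($\lambda_{[i]}$ is the $i$th largest component of $\lambda$)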
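The concentration constraint uses the sum of the $\lfloor m/5 \rfloor$ largest components of $\lambda$, which is a convex function of $\lambda$. A minimal feasibility check, with a hypothetical function name and example designs, might look like this:

```python
import numpy as np

def concentration_ok(lam, limit=0.8):
    """Check that the floor(m/5) largest components of lam sum to at most limit."""
    lam = np.asarray(lam, dtype=float)
    k = len(lam) // 5
    largest = np.sort(lam)[::-1][:k]  # components in decreasing order
    return bool(largest.sum() <= limit)

# Hypothetical designs over m = 10 test vectors.
uniform = np.full(10, 0.1)
concentrated = np.array([0.5, 0.4] + [0.0125] * 8)
```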

Positive definite matrix completion
matrix $A = A^T$
- entries $A_{ij}$, $(i, j) \in N$ are fixed
- entries $A_{ij}$, $(i, j) \notin N$ are free
positive definite completion: choose free entries such that $A > 0$ (if possible)
maximum entropy completion:
$$\begin{array}{ll} \text{maximize} & \log\det A \\ \text{subject to} & A > 0 \end{array}$$
property: $(A^{-1})_{ij} = 0$ for $(i, j) \notin N$ (since $\partial \log\det A^{-1} / \partial A_{ij} = -(A^{-1})_{ij}$)
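The zero-pattern property of the inverse can be seen on a $3 \times 3$ example where only the corner entry $A_{13}$ is free. Expanding the determinant shows it is a concave quadratic in that entry, maximized at $A_{13} = A_{12} A_{23} / A_{22}$. The sketch below assumes this closed form; the function name and the numbers are hypothetical.

```python
import numpy as np

def complete_3x3(a, b, c, p, q):
    """Fill the free corner entry z of [[a, p, z], [p, b, q], [z, q, c]].

    det A = abc - a q^2 - c p^2 + 2 p q z - b z^2, maximized at z = p q / b.
    """
    z = p * q / b
    return np.array([[a, p, z], [p, b, q], [z, q, c]])

# Hypothetical fixed entries: diagonal 2, first off-diagonals 1.
A = complete_3x3(2.0, 2.0, 2.0, 1.0, 1.0)
A_inv = np.linalg.inv(A)

def with_corner(z):
    """Same fixed entries with a different value of the free entry."""
    B = A.copy()
    B[0, 2] = B[2, 0] = z
    return B
```

The inverse of the completed matrix has a zero in the free position, and any other choice of the corner entry gives a smaller determinant.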

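Moment problem
there exists a probability distribution on $\mathbf{R}$ such that $\mu_i = \mathbf{E}\, t^i$, $i = 1, \ldots, 2n$, if and only if
$$H(\mu) = \begin{bmatrix} 1 & \mu_1 & \cdots & \mu_{n-1} & \mu_n \\ \mu_1 & \mu_2 & \cdots & \mu_n & \mu_{n+1} \\ \vdots & \vdots & & \vdots & \vdots \\ \mu_{n-1} & \mu_n & \cdots & \mu_{2n-2} & \mu_{2n-1} \\ \mu_n & \mu_{n+1} & \cdots & \mu_{2n-1} & \mu_{2n} \end{bmatrix} \geq 0$$
LMI in variables $\mu_i$
hence, can solve
$$\begin{array}{ll} \text{maximize/minimize} & \mathbf{E}\,(c_0 + c_1 t + \cdots + c_{2n} t^{2n}) \\ \text{subject to} & \underline{\mu}_i \leq \mathbf{E}\, t^i \leq \overline{\mu}_i, \quad i = 1, \ldots, 2n \end{array}$$
over all probability distributions on $\mathbf{R}$ by solving SDP
$$\begin{array}{ll} \text{maximize/minimize} & c_0 + c_1 \mu_1 + \cdots + c_{2n} \mu_{2n} \\ \text{subject to} & \underline{\mu}_i \leq \mu_i \leq \overline{\mu}_i, \quad i = 1, \ldots, 2n \\ & H(\mu_1, \ldots, \mu_{2n}) \geq 0 \end{array}$$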
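As an illustration of the Hankel condition, the sketch below builds $H(\mu)$ with entries $H_{ij} = \mu_{i+j}$ (taking $\mu_0 = 1$) and tests it on the first four moments of a standard normal distribution, $(0, 1, 0, 3)$, and on a sequence that violates $\mathbf{E}\, t^4 \geq (\mathbf{E}\, t^2)^2$. The function names are assumptions for this example.

```python
import numpy as np

def hankel_moments(mu):
    """Build H(mu) from mu = [mu_1, ..., mu_2n], with mu_0 = 1."""
    m = np.concatenate(([1.0], np.asarray(mu, dtype=float)))
    n = (len(m) - 1) // 2
    return np.array([[m[i + j] for j in range(n + 1)] for i in range(n + 1)])

def is_moment_sequence(mu, tol=1e-12):
    """Test the LMI H(mu) >= 0."""
    return bool(np.linalg.eigvalsh(hankel_moments(mu)).min() >= -tol)

gaussian = [0.0, 1.0, 0.0, 3.0]   # moments of N(0, 1) up to order 4
invalid = [0.0, 1.0, 0.0, 0.5]    # fourth moment smaller than squared second moment
```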

Other applications
- maximizing products of positive concave functions
- minimum volume ellipsoid covering union or sum of ellipsoids
- maximum volume ellipsoid in intersection or sum of ellipsoids
- computing channel capacity in information theory
- maximum likelihood estimation

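MAXDET duality theory
primal MAXDET problem
$$\begin{array}{ll} \text{minimize} & c^T x + \log\det G(x)^{-1} \\ \text{subject to} & G(x) = G_0 + x_1 G_1 + \cdots + x_m G_m > 0 \\ & F(x) = F_0 + x_1 F_1 + \cdots + x_m F_m \geq 0 \end{array}$$
optimal value $p^\star$
dual MAXDET problem
$$\begin{array}{ll} \text{maximize} & \log\det W - \mathbf{Tr}\, G_0 W - \mathbf{Tr}\, F_0 Z + l \\ \text{subject to} & \mathbf{Tr}\, F_i Z + \mathbf{Tr}\, G_i W = c_i, \quad i = 1, \ldots, m \\ & W > 0, \quad Z \geq 0 \end{array}$$
variables $W = W^T \in \mathbf{R}^{l \times l}$, $Z = Z^T \in \mathbf{R}^{n \times n}$; optimal value $d^\star$
properties
- $p^\star \geq d^\star$ (always)
- $p^\star = d^\star$ (usually)
definition: duality gap = primal objective $-$ dual objective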
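The inequality $p^\star \geq d^\star$ can be verified directly. The short derivation below is a sketch: it takes any primal feasible $x$ and dual feasible $W$, $Z$, substitutes the dual equality constraints for $c_i$, and shows that the duality gap is nonnegative.

```latex
\begin{aligned}
&c^T x + \log\det G(x)^{-1}
  - \bigl(\log\det W - \mathbf{Tr}\, G_0 W - \mathbf{Tr}\, F_0 Z + l\bigr) \\
&\quad= \sum_{i=1}^m x_i \bigl(\mathbf{Tr}\, F_i Z + \mathbf{Tr}\, G_i W\bigr)
  + \mathbf{Tr}\, G_0 W + \mathbf{Tr}\, F_0 Z
  - \log\det G(x) - \log\det W - l \\
&\quad= \mathbf{Tr}\, F(x) Z
  + \bigl(\mathbf{Tr}\, G(x) W - \log\det\bigl(G(x) W\bigr) - l\bigr) \\
&\quad\geq 0 .
\end{aligned}
```

The first term is nonnegative because $F(x) \geq 0$ and $Z \geq 0$. The second term equals $\sum_{i=1}^{l} (\nu_i - \log \nu_i - 1)$, where $\nu_i > 0$ are the eigenvalues of $G(x)^{1/2} W G(x)^{1/2}$, and each summand is nonnegative since $\log \nu \leq \nu - 1$.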

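Example: experiment design
primal problem
$$\begin{array}{ll} \text{minimize} & \log\det \left( \sum_{i=1}^m \lambda_i v_i v_i^T \right)^{-1} \\ \text{subject to} & \sum_{i=1}^m \lambda_i = 1 \\ & \lambda_i \geq 0, \quad i = 1, \ldots, m \\ & \sum_{i=1}^m \lambda_i v_i v_i^T > 0 \end{array}$$
dual problem
$$\begin{array}{ll} \text{maximize} & \log\det W \\ \text{subject to} & W = W^T > 0 \\ & v_i^T W v_i \leq 1, \quad i = 1, \ldots, m \end{array}$$
interpretation: $W$ determines the smallest ellipsoid with center at the origin and containing $v_i$, $i = 1, \ldots, m$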

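Central path: general
general convex optimization problem ($f_0$, $C$ convex)
$$\begin{array}{ll} \text{minimize} & f_0(x) \\ \text{subject to} & x \in C \end{array}$$
$\varphi$ is a barrier function for $C$
- smooth, convex
- $\varphi(x) \to \infty$ as $x\ (\in \mathbf{int}\, C) \to \partial C$
central path
$$x^\star(t) = \mathop{\mathrm{argmin}}_{x \in C} \bigl( t f_0(x) + \varphi(x) \bigr) \quad \text{for } t > 0$$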

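Central path: MAXDET problem
$$f_0(x) = c^T x + \log\det G(x)^{-1}, \qquad C = \{x \mid F(x) \geq 0\}$$
barrier function for LMI $F(x) \geq 0$
$$\varphi(x) = \begin{cases} \log\det F(x)^{-1} & \text{if } F(x) > 0 \\ +\infty & \text{otherwise} \end{cases}$$
MAXDET central path: $x^\star(t) = \mathop{\mathrm{argmin}}_{F(x) > 0,\ G(x) > 0} \varphi(t, x)$, with
$$\varphi(t, x) = t \bigl( c^T x + \log\det G(x)^{-1} \bigr) + \log\det F(x)^{-1}$$
example: SDP
[Figure: central path running from the analytic center $x_{\mathrm{ac}}$ ($t = 0$) to the optimal point with $c^T x = p^\star$ ($t = \infty$); cost direction $-c$]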

Path-following for MAXDET
properties of MAXDET central path
- from $x^\star(t)$, get dual feasible $Z^\star(t)$, $W^\star(t)$
- corresponding duality gap is $n/t$
- $x^\star(t) \to$ optimal as $t \to \infty$
path-following algorithm
given strictly feasible $x$, $t \geq 1$
repeat
  1. compute $x^\star(t)$ using Newton's method
  2. $x := x^\star(t)$
  3. increase $t$
until $n/t <$ tol
tradeoff: large increase in $t$ means
- fast gap reduction (fewer outer iterations), but
- many Newton steps to compute $x^\star(t^+)$ (more Newton steps per outer iteration)
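The outer loop can be sketched in a few lines for the LP special case $F(x) = \mathbf{diag}(b - Ax)$ with no $\log\det G$ term, where the barrier is $-\sum_i \log(b_i - a_i^T x)$. This is an illustrative sketch, not the authors' implementation: the function names, the update factor `mu = 10`, the backtracking parameters, and the choice to test the gap before increasing $t$ are all assumptions.

```python
import numpy as np

def centering(c, A, b, x, t, iters=50, eps=1e-8):
    """Newton's method for minimizing t*c^T x - sum(log(b - A x))."""
    def phi(z):
        s = b - A @ z
        return np.inf if s.min() <= 0 else t * (c @ z) - np.log(s).sum()
    for _ in range(iters):
        s = b - A @ x
        grad = t * c + A.T @ (1.0 / s)
        hess = A.T @ (A / s[:, None] ** 2)
        dx = -np.linalg.solve(hess, grad)
        decrement = -grad @ dx              # squared Newton decrement
        if decrement / 2 <= eps:
            break
        step = 1.0
        # Backtracking line search; infeasible trial points have phi = inf.
        while phi(x + step * dx) > phi(x) - 0.25 * step * decrement:
            step *= 0.5
        x = x + step * dx
    return x

def path_following(c, A, b, x, t=1.0, mu=10.0, tol=1e-4):
    n = A.shape[0]                          # number of inequalities
    while True:
        x = centering(c, A, b, x, t)        # steps 1 and 2: x := x*(t)
        if n / t < tol:                     # duality gap on the central path
            return x
        t *= mu                             # step 3: increase t

# Hypothetical LP: minimize x1 + 2 x2 over the box -1 <= x <= 1.
A = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
b = np.ones(4)
c = np.array([1.0, 2.0])
x_opt = path_following(c, A, b, np.zeros(2))
```

A larger `mu` reduces the number of outer iterations but makes each centering step start farther from its target, which is the tradeoff noted above.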

Complexity of Newton's method for self-concordant functions (Nesterov & Nemirovsky, late 1980s)
definition: along a line,
$$|f'''(t)| \leq K f''(t)^{3/2}$$
example ($K = 2$):
$$\varphi(t, x) = t \bigl( c^T x + \log\det G(x)^{-1} \bigr) + \log\det F(x)^{-1} \quad (t \geq 1)$$
complexity of Newton's method
- theorem: #Newton steps to minimize $\varphi(t, x)$, starting from $x^{(0)}$:
  #steps $\leq 10.7\, \bigl( \varphi(t, x^{(0)}) - \varphi^\star(t) \bigr) + 5$
- empirically: #steps $\approx \bigl( \varphi(t, x^{(0)}) - \varphi^\star(t) \bigr) + 3$
[Figure: # Newton steps versus $\varphi(t, x^{(0)}) - \varphi^\star(t)$]
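The simplest function meeting the definition with $K = 2$ is the scalar logarithmic barrier, which is the one-dimensional building block of $\log\det F(x)^{-1}$. A short worked check:

```latex
f(t) = -\log t, \qquad
f''(t) = \frac{1}{t^2}, \qquad
f'''(t) = -\frac{2}{t^3}
\quad\Longrightarrow\quad
|f'''(t)| = \frac{2}{t^3} = 2 \left( \frac{1}{t^2} \right)^{3/2} = 2\, f''(t)^{3/2}
\quad (t > 0).
```

Here the bound holds with equality, so $K = 2$ cannot be improved for this function.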

Path-following algorithm
idea: choose $t^+$, starting point $\hat{x}$ for Newton alg. s.t.
$$\varphi(t^+, \hat{x}) - \varphi^\star(t^+) = \gamma$$
(bounds # Newton steps required to compute $x^\star(t^+)$)
in practice: use lower bound from duality
$$\begin{aligned} \varphi(t^+, \hat{x}) - \varphi^\star(t^+) &\leq \varphi(t^+, \hat{x}) + \log\det Z^{-1} + t \log\det W^{-1} + \mathbf{Tr}\, G_0 W + \mathbf{Tr}\, F_0 Z - l \\ &= \varphi(t^+, \hat{x}) + \text{function of } W, Z \end{aligned}$$

two extreme choices
- fixed reduction: $\hat{x} = x^\star(t)$, $t^+ = \bigl( 1 + \sqrt{2\gamma/n} \bigr)\, t$
- predictor step along tangent of central path
[Figure: central path through $x^\star(t)$ and $x^\star(t^+)$ toward $x^\star$, with predictor point $\hat{x}$ on the tangent at $x^\star(t)$]
[Figure: duality gap versus Newton iterations, two panels ($\gamma = 10$ and $\gamma = 50$), each comparing fixed reduction with predictor]

Total complexity
total number of Newton steps
- upper bound: $O\bigl( \sqrt{n} \log(1/\epsilon) \bigr)$
- practice, fixed-reduction method: $O\bigl( \sqrt{n} \log(1/\epsilon) \bigr)$
- practice, with predictor steps: $O\bigl( \log(1/\epsilon) \bigr)$
[Figure: Newton iterations versus $n$, $l$, and $m$ (three panels), comparing fixed reduction with predictor steps]
one Newton step involves a least-squares problem
$$\text{minimize} \quad \|\tilde{F}(v)\|_F^2 + \|\tilde{G}(v)\|_F^2$$
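The $\sqrt{n} \log(1/\epsilon)$ growth of the fixed-reduction method can be illustrated by counting outer iterations. The sketch below assumes that $t$ is multiplied by $1 + \sqrt{2\gamma/n}$ at each outer iteration and that the loop stops once $n/t < \epsilon$; both the update factor and the function name are assumptions for illustration, and the count covers outer iterations only, each of which costs a bounded number of Newton steps.

```python
import math

def outer_iterations(n, eps, gamma, t=1.0):
    """Count updates t := (1 + sqrt(2*gamma/n)) * t until n / t < eps."""
    factor = 1.0 + math.sqrt(2.0 * gamma / n)
    k = 0
    while n / t >= eps:
        t *= factor
        k += 1
    return k

# Hypothetical settings: gap tolerance 1e-6, gamma = 10.
small = outer_iterations(50, 1e-6, 10.0)
large = outer_iterations(200, 1e-6, 10.0)
```

Quadrupling $n$ roughly doubles the count, consistent with the $\sqrt{n}$ dependence.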

Conclusion
MAXDET problem
$$\begin{array}{ll} \text{minimize} & c^T x + \log\det G(x)^{-1} \\ \text{subject to} & G(x) > 0, \quad F(x) \geq 0 \end{array}$$
arises in many different areas
- includes SDP, LP, convex QCQP
- geometrical problems involving ellipsoids
- experiment design, max. likelihood estimation, channel capacity, ...
convex, hence can be solved very efficiently
software/paper available on ftp soon (anonymous ftp to isl.stanford.edu in /pub/boyd/maxdet)