Advances in Convex Optimization: Theory, Algorithms, and Applications


Advances in Convex Optimization: Theory, Algorithms, and Applications
Stephen Boyd, Electrical Engineering Department, Stanford University
(joint work with Lieven Vandenberghe, UCLA)
ISIT 02, Lausanne, 7/3/02

Two problems

polytope $P$ described by linear inequalities, $a_i^T x \le b_i$, $i = 1, \ldots, L$

Problem 1a: find the minimum volume ellipsoid containing $P$
Problem 1b: find the maximum volume ellipsoid inside $P$

are these (computationally) difficult, or easy?

problem 1a is very difficult, in practice and in theory (NP-hard)
problem 1b is very easy, in practice (readily solved on a small computer) and in theory (polynomial complexity)

Two more problems

find the capacity of a discrete memoryless channel, subject to constraints on the input distribution

Problem 2a: find channel capacity, subject to: no more than 30% of the probability is concentrated on any 10% of the input symbols
Problem 2b: find channel capacity, subject to: at least 30% of the probability is concentrated on 10% of the input symbols

are problems 2a and 2b (computationally) difficult, or easy?

problem 2a is very easy, in practice & theory
problem 2b is very difficult¹

¹ I'm almost sure

Moral

very difficult and very easy problems can look quite similar... unless you're trained to recognize the difference

Outline

what's new in convex optimization
some new standard problem classes
generalized inequalities and semidefinite programming
interior-point algorithms and complexity analysis

Convex optimization problems

minimize $f_0(x)$
subject to $f_1(x) \le 0, \ldots, f_L(x) \le 0$, $Ax = b$

$x \in \mathbf{R}^n$ is the optimization variable
$f_i : \mathbf{R}^n \to \mathbf{R}$ are convex, i.e., for all $x, y$ and $0 \le \lambda \le 1$,
$f_i(\lambda x + (1-\lambda)y) \le \lambda f_i(x) + (1-\lambda) f_i(y)$

examples: linear & (convex) quadratic programs; problems 1b & 2a (if formulated properly)
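To make the standard form concrete, here is a minimal numerical instance written in cvxpy (a modeling tool not mentioned in the talk); the data and variable names are made up:

```python
import numpy as np
import cvxpy as cp

# A tiny instance of the standard form above: convex quadratic objective,
# two convex (here: linear) inequality constraints. All data is invented.
x = cp.Variable(2)
prob = cp.Problem(cp.Minimize(cp.sum_squares(x - np.array([1.0, 2.0]))),
                  [x[0] + x[1] <= 1, x >= 0])
prob.solve()
print(x.value)  # the projection of (1, 2) onto the feasible set
```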

Convex analysis & optimization

nice properties of convex optimization problems known since 1960s:
local solutions are global
duality theory, optimality conditions
simple solution methods like alternating projections

convex analysis well developed by 1970s (Rockafellar):
separating & supporting hyperplanes
subgradient calculus

What's new (since 1990 or so)

powerful primal-dual interior-point methods: extremely efficient, handle nonlinear, large scale problems
polynomial-time complexity results for interior-point methods, based on self-concordance analysis of Newton's method
extension to generalized inequalities: semidefinite & maxdet programming
new standard problem classes: generalizations of LP, with theory, algorithms, software
lots of applications: control, combinatorial optimization, signal processing, circuit design, ...

Recent history (1984–97)

interior-point methods for LP:
(1984) Karmarkar's interior-point LP method
theory (Ye, Renegar, Kojima, Todd, Monteiro, Roos, ...)
practice (Wright, Mehrotra, Vanderbei, Shanno, Lustig, ...)
(1988) Nesterov & Nemirovsky's self-concordance analysis
(1989–) semidefinite programming in control (Boyd, El Ghaoui, Balakrishnan, Feron, Scherer, ...)
(1990–) semidefinite programming in combinatorial optimization (Alizadeh, Goemans, Williamson, Lovász & Schrijver, Parrilo, ...)
(1994) interior-point methods for nonlinear convex problems (Nesterov & Nemirovsky, Overton, Todd, Ye, Sturm, ...)
(1997–) robust optimization (Ben Tal, Nemirovsky, El Ghaoui, ...)

Some new standard (convex) problem classes

second-order cone programming (SOCP)
semidefinite programming (SDP), maxdet programming
(convex form) geometric programming (GP)

for these new problem classes we have:
complete duality theory, similar to LP
good algorithms, and robust, reliable software
wide variety of new applications

Second-order cone programming

second-order cone program (SOCP) has form

minimize $c_0^T x$
subject to $\|A_i x + b_i\|_2 \le c_i^T x + d_i$, $i = 1, \ldots, m$, $Fx = g$

variable is $x \in \mathbf{R}^n$
includes LP as special case ($A_i = 0$, $b_i = 0$), QP ($c_i = 0$)
nondifferentiable when $A_i x + b_i = 0$
new IP methods can solve (almost) as fast as LPs
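A sketch of this SOCP form in cvxpy, with randomly generated (made-up) data; the generous offsets $d_i$ that make $x = 0$ strictly feasible are an assumption for the toy instance, not part of the talk:

```python
import numpy as np
import cvxpy as cp

np.random.seed(0)
n, m, k = 3, 4, 2                       # sizes are made up
A = [np.random.randn(k, n) for _ in range(m)]
b = [np.random.randn(k) for _ in range(m)]
c = [np.random.randn(n) for _ in range(m)]
d = 5.0 * np.ones(m)                    # generous offsets: x = 0 is strictly feasible
c0 = np.random.randn(n)
F, g = np.ones((1, n)), np.zeros(1)

x = cp.Variable(n)
constraints = [cp.norm(A[i] @ x + b[i], 2) <= c[i] @ x + d[i] for i in range(m)]
constraints += [F @ x == g]
prob = cp.Problem(cp.Minimize(c0 @ x), constraints)
prob.solve()                            # handled as an SOCP by a conic solver
print(prob.status, prob.value)
```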

Robust linear programming

robust linear program:

minimize $c^T x$
subject to $a_i^T x \le b_i$ for all $a_i \in \mathcal{E}_i$

ellipsoid $\mathcal{E}_i = \{ \bar a_i + F_i p : \|p\|_2 \le 1 \}$ describes the uncertainty in the constraint vectors $a_i$
$x$ must satisfy the constraints for all possible values of $a_i$
can extend to uncertain $c$ & $b_i$, correlated uncertainties, ...

Robust LP as SOCP

robust LP is

minimize $c^T x$
subject to $\bar a_i^T x + \sup\{ (F_i p)^T x : \|p\|_2 \le 1 \} \le b_i$

which is the same as

minimize $c^T x$
subject to $\bar a_i^T x + \|F_i^T x\|_2 \le b_i$

an SOCP (hence, readily solved)
the term $\|F_i^T x\|_2$ is the extra margin required to accommodate the uncertainty in $a_i$

Stochastic robust linear programming

minimize $c^T x$
subject to $\mathrm{Prob}(a_i^T x \le b_i) \ge \eta$, $i = 1, \ldots, m$

where $a_i \sim \mathcal{N}(\bar a_i, \Sigma_i)$, $\eta \ge 1/2$ ($c$ and $b_i$ are fixed)
i.e., each constraint must hold with probability at least $\eta$
equivalent to the SOCP

minimize $c^T x$
subject to $\bar a_i^T x + \Phi^{-1}(\eta) \|\Sigma_i^{1/2} x\|_2 \le b_i$, $i = 1, \ldots, m$

where $\Phi$ is the CDF of a $\mathcal{N}(0,1)$ random variable
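A sketch of the chance-constrained LP as an SOCP, using scipy's norm.ppf for $\Phi^{-1}$; all data (means, $\Sigma_i^{1/2}$, and the box constraint added only to keep the random toy instance bounded) is invented for illustration:

```python
import numpy as np
import cvxpy as cp
from scipy.stats import norm

np.random.seed(0)
n, m, eta = 3, 5, 0.95                          # made-up sizes and confidence level
abar = np.random.randn(m, n)                    # means of the Gaussian rows a_i
Sig_half = [0.2 * np.eye(n) for _ in range(m)]  # Sigma_i^{1/2}, made up
b = np.ones(m)                                  # x = 0 is strictly feasible
c = np.random.randn(n)

x = cp.Variable(n)
kappa = norm.ppf(eta)                           # Phi^{-1}(eta)
cons = [abar[i] @ x + kappa * cp.norm(Sig_half[i] @ x, 2) <= b[i]
        for i in range(m)]
cons += [cp.norm(x, "inf") <= 10]               # box only to keep the toy bounded
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
```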

Geometric programming

log-sum-exp function: $\mathrm{lse}(x) = \log(e^{x_1} + \cdots + e^{x_n})$ ... a smooth convex approximation of the max function

geometric program (GP), with variable $x \in \mathbf{R}^n$:

minimize $\mathrm{lse}(A_0 x + b_0)$
subject to $\mathrm{lse}(A_i x + b_i) \le 0$, $i = 1, \ldots, m$

where $A_i \in \mathbf{R}^{m_i \times n}$, $b_i \in \mathbf{R}^{m_i}$
new IP methods can solve large scale GPs (almost) as fast as LPs
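A sketch of a GP in this convex (log-sum-exp) form via cvxpy's log_sum_exp atom; the random data, the shift that makes $x = 0$ feasible, and the box constraint that keeps the instance bounded are all assumptions, not from the talk:

```python
import numpy as np
import cvxpy as cp

np.random.seed(0)
n, m, mi = 3, 2, 4                       # made-up sizes; each lse has mi terms
A = [np.random.randn(mi, n) for _ in range(m + 1)]
b = [np.random.randn(mi) - 2.0 for _ in range(m + 1)]  # shift: x = 0 feasible

x = cp.Variable(n)
objective = cp.Minimize(cp.log_sum_exp(A[0] @ x + b[0]))
constraints = [cp.log_sum_exp(A[i] @ x + b[i]) <= 0 for i in range(1, m + 1)]
constraints += [cp.norm(x, "inf") <= 1]  # keep the random instance bounded
cp.Problem(objective, constraints).solve()
```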

Dual geometric program

the dual of a geometric program is an unnormalized entropy problem

maximize $\sum_{i=0}^m \left( b_i^T \nu_i + \mathrm{entr}(\nu_i) \right)$
subject to $\nu_i \succeq 0$, $i = 0, \ldots, m$, $\mathbf{1}^T \nu_0 = 1$, $\sum_{i=0}^m A_i^T \nu_i = 0$

dual variables are $\nu_i \in \mathbf{R}^{m_i}$, $i = 0, \ldots, m$
(unnormalized) entropy is $\mathrm{entr}(\nu) = -\sum_{j=1}^n \nu_j \log \frac{\nu_j}{\mathbf{1}^T \nu}$

GP is closely related to problems involving entropy, KL divergence

Example: DMC capacity problem

$x \in \mathbf{R}^n$ is the distribution of the input; $y \in \mathbf{R}^m$ is the distribution of the output
$P \in \mathbf{R}^{m \times n}$ gives the conditional probabilities: $y = Px$

primal channel capacity problem:

maximize $c^T x + \mathrm{entr}(y)$
subject to $x \succeq 0$, $\mathbf{1}^T x = 1$, $y = Px$

where $c_j = \sum_{i=1}^m p_{ij} \log p_{ij}$

dual channel capacity problem is a simple GP:

minimize $\mathrm{lse}(u)$
subject to $c + P^T u \preceq 0$
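The primal capacity problem is easy to state directly in cvxpy using the entr atom; the 3-input, 2-output channel matrix below is made up:

```python
import numpy as np
import cvxpy as cp

# made-up channel; column j is the distribution of y given input symbol j
P = np.array([[0.9, 0.1, 0.5],
              [0.1, 0.9, 0.5]])
m, n = P.shape
c = np.sum(P * np.log(P), axis=0)   # c_j = sum_i p_ij log p_ij

x = cp.Variable(n)                  # input distribution
y = P @ x                           # induced output distribution
prob = cp.Problem(cp.Maximize(c @ x + cp.sum(cp.entr(y))),
                  [x >= 0, cp.sum(x) == 1])
prob.solve()
print(prob.value / np.log(2))       # capacity in bits per channel use
```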

Generalized inequalities

with a proper convex cone $K \subseteq \mathbf{R}^k$ we associate the generalized inequality

$x \preceq_K y \iff y - x \in K$

convex optimization problem with generalized inequalities:

minimize $f_0(x)$
subject to $f_1(x) \preceq_{K_1} 0, \ldots, f_L(x) \preceq_{K_L} 0$, $Ax = b$

$f_i : \mathbf{R}^n \to \mathbf{R}^{k_i}$ are $K_i$-convex: for all $x, y$ and $0 \le \lambda \le 1$,
$f_i(\lambda x + (1-\lambda)y) \preceq_{K_i} \lambda f_i(x) + (1-\lambda) f_i(y)$

Semidefinite program

semidefinite program (SDP):

minimize $c^T x$
subject to $A_0 + x_1 A_1 + \cdots + x_n A_n \succeq 0$, $Cx = d$

$A_i = A_i^T \in \mathbf{R}^{m \times m}$
inequality is a matrix inequality, i.e., $K$ is the positive semidefinite cone
a single constraint, which is affine (hence, matrix convex)
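A sketch of this SDP in cvxpy; the symmetric data matrices are random, $A_0 = I$ makes $x = 0$ feasible, and the box constraint only keeps the toy instance bounded (none of this is from the talk):

```python
import numpy as np
import cvxpy as cp

np.random.seed(0)
n, p = 3, 4                               # made-up sizes
A = [np.eye(p)]                           # A_0 = I, so x = 0 is feasible
for _ in range(n):
    M = np.random.randn(p, p)
    A.append(0.2 * (M + M.T))             # symmetric A_1, ..., A_n
c = np.random.randn(n)

x = cp.Variable(n)
lmi = sum(x[i] * A[i + 1] for i in range(n)) + A[0]
cons = [lmi >> 0, cp.norm(x, "inf") <= 1]  # box only to keep the toy bounded
prob = cp.Problem(cp.Minimize(c @ x), cons)
prob.solve()
```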

Maxdet problem

extension of SDP: the maxdet problem

minimize $c^T x - \log \det_+ (G_0 + x_1 G_1 + \cdots + x_n G_n)$
subject to $A_0 + x_1 A_1 + \cdots + x_n A_n \succeq 0$, $Cx = d$

$x \in \mathbf{R}^n$ is the variable
$A_i = A_i^T \in \mathbf{R}^{m \times m}$, $G_i = G_i^T \in \mathbf{R}^{p \times p}$

$\det_+(Z) = \begin{cases} \det Z & \text{if } Z \succ 0 \\ 0 & \text{otherwise} \end{cases}$
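A corresponding maxdet sketch using cvxpy's log_det atom (the extra LMI constraint is omitted for brevity, and the data is made up; log_det implicitly enforces positive definiteness of its argument):

```python
import numpy as np
import cvxpy as cp

np.random.seed(1)
n, p = 2, 3                               # made-up sizes
G = [np.eye(p)]                           # G_0 = I, so x = 0 is feasible
for _ in range(n):
    M = np.random.randn(p, p)
    G.append(0.2 * (M + M.T))             # symmetric G_1, ..., G_n
c = np.random.randn(n)

x = cp.Variable(n)
Gx = sum(x[i] * G[i + 1] for i in range(n)) + G[0]
cons = [cp.norm(x, "inf") <= 1]           # box only to keep the toy bounded
prob = cp.Problem(cp.Minimize(c @ x - cp.log_det(Gx)), cons)
prob.solve()
```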

Semidefinite & maxdet programming

nearly complete duality theory, similar to LP
interior-point algorithms that are efficient in theory & practice
applications in many areas:
control theory
combinatorial optimization & graph theory
structural optimization
statistics
signal processing
circuit design
geometrical problems
algebraic geometry

Chebyshev bounds

generalized Chebyshev inequalities: lower bounds on $\mathrm{Prob}(X \in C)$
$X \in \mathbf{R}^n$ is a random variable with $\mathbf{E}\,X = a$, $\mathbf{E}\,XX^T = S$
$C$ is an open polyhedron $C = \{ x : a_i^T x < b_i,\ i = 1, \ldots, m \}$

cf. the classical Chebyshev inequality on $\mathbf{R}$: if $\mathbf{E}\,X = 0$, $\mathbf{E}\,X^2 = \sigma^2$, then

$\mathrm{Prob}(X < 1) \ge \frac{1}{1 + \sigma^2}$

Chebyshev bounds via SDP

minimize $1 - \sum_{i=1}^m \lambda_i$
subject to $a_i^T z_i \ge b_i \lambda_i$, $i = 1, \ldots, m$
$\sum_{i=1}^m \begin{bmatrix} Z_i & z_i \\ z_i^T & \lambda_i \end{bmatrix} \preceq \begin{bmatrix} S & a \\ a^T & 1 \end{bmatrix}$
$\begin{bmatrix} Z_i & z_i \\ z_i^T & \lambda_i \end{bmatrix} \succeq 0$, $i = 1, \ldots, m$

an SDP with variables $Z_i = Z_i^T \in \mathbf{R}^{n \times n}$, $z_i \in \mathbf{R}^n$, and $\lambda_i \in \mathbf{R}$
the optimal value is a (sharp) lower bound on $\mathrm{Prob}(X \in C)$
can construct a distribution with $\mathbf{E}\,X = a$, $\mathbf{E}\,XX^T = S$ that attains the lower bound
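A sketch transcribing the SDP above into cvxpy; the moments and the polyhedron are invented, and the bmat/reshape plumbing is just one way to build the block matrices:

```python
import numpy as np
import cvxpy as cp

# made-up moments (E X = a, E XX^T = S) and polyhedron C = {x : a_i^T x < b_i}
n, m = 2, 3
amean = np.zeros(n)
S = np.eye(n)
a = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([-1.0, -1.0])]
b = [1.0, 1.0, 1.0]

Z = [cp.Variable((n, n), symmetric=True) for _ in range(m)]
z = [cp.Variable(n) for _ in range(m)]
lam = cp.Variable(m)

def blk(i):
    # the (n+1) x (n+1) block [[Z_i, z_i], [z_i^T, lambda_i]]
    zc = cp.reshape(z[i], (n, 1))
    return cp.bmat([[Z[i], zc], [zc.T, cp.reshape(lam[i], (1, 1))]])

big = np.block([[S, amean.reshape(-1, 1)],
                [amean.reshape(1, -1), np.ones((1, 1))]])
cons = [blk(i) >> 0 for i in range(m)]
cons += [a[i] @ z[i] >= b[i] * lam[i] for i in range(m)]
cons += [sum(blk(i) for i in range(m)) << big]
prob = cp.Problem(cp.Minimize(1 - cp.sum(lam)), cons)
prob.solve()
print(prob.value)   # lower bound on Prob(X in C)
```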

Detection example

$x = s + v$

$x \in \mathbf{R}^n$: received signal
$s$: transmitted signal, $s \in \{s_1, s_2, \ldots, s_N\}$ (one of $N$ possible symbols)
$v$: noise with $\mathbf{E}\,v = 0$, $\mathbf{E}\,vv^T = I$ (but otherwise unknown distribution)

detection problem: given the observed value of $x$, estimate $s$

example ($n = 2$, $N = 7$)

[figure: seven symbols $s_1, \ldots, s_7$ in the plane, with their Voronoi regions]

detector selects the symbol $s_k$ closest to the received signal $x$
correct detection if $s_k + v$ lies in the Voronoi region around $s_k$

example: the bound on the probability of correct detection of $s_1$ is 0.205

[figure: same constellation; solid circles show a distribution with probability of correct detection 0.205]

Boolean least-squares

$x \in \{-1, 1\}^n$ is transmitted; we receive $y = Ax + v$, $v \sim \mathcal{N}(0, I)$

the ML estimate of $x$ is found by solving the boolean least-squares problem

minimize $\|Ax - y\|^2$
subject to $x_i^2 = 1$, $i = 1, \ldots, n$

could check all $2^n$ possible values of $x$ ... an NP-hard problem
many heuristics for approximate solution

Boolean least-squares as matrix problem

$\|Ax - y\|^2 = x^T A^T A x - 2 y^T A x + y^T y = \mathrm{Tr}(A^T A X) - 2 y^T A x + y^T y$

where $X = xx^T$; hence we can express BLS as

minimize $\mathrm{Tr}(A^T A X) - 2 y^T A x + y^T y$
subject to $X_{ii} = 1$, $X \succeq xx^T$, $\mathrm{rank}(X) = 1$

... still a very hard problem

SDP relaxation for BLS

ignore the rank-one constraint, and use

$X \succeq xx^T \iff \begin{bmatrix} X & x \\ x^T & 1 \end{bmatrix} \succeq 0$

to obtain the SDP relaxation (with variables $X$, $x$)

minimize $\mathrm{Tr}(A^T A X) - 2 y^T A x + y^T y$
subject to $X_{ii} = 1$, $\begin{bmatrix} X & x \\ x^T & 1 \end{bmatrix} \succeq 0$

the optimal value of the SDP gives a lower bound for BLS
if the optimal matrix is rank one, we're done
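The relaxation translates almost line for line into cvxpy; the problem data below ($A$, $y$, sizes) is simulated, not from the talk:

```python
import numpy as np
import cvxpy as cp

np.random.seed(0)
m, n = 10, 5                             # made-up sizes
A = np.random.randn(m, n)
x_true = np.sign(np.random.randn(n))     # simulated +/-1 signal
y = A @ x_true + 0.5 * np.random.randn(m)

X = cp.Variable((n, n), symmetric=True)
x = cp.Variable(n)
xc = cp.reshape(x, (n, 1))
M = cp.bmat([[X, xc], [xc.T, np.ones((1, 1))]])
obj = cp.Minimize(cp.trace(A.T @ A @ X) - 2 * (A.T @ y) @ x + y @ y)
prob = cp.Problem(obj, [cp.diag(X) == 1, M >> 0])
prob.solve()
print(prob.value)    # lower bound on the boolean least-squares optimum
```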

Stochastic interpretation and heuristic

suppose $X$, $x$ are optimal for the SDP relaxation
generate $z$ from the normal distribution $\mathcal{N}(x, X - xx^T)$
take $\hat x = \mathrm{sgn}(z)$ as an approximate solution of BLS
(can repeat many times and take the best one)
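A small numpy sketch of this randomized rounding heuristic, packaged as a hypothetical helper randomized_round that takes the relaxation's optimal $X^\star$, $x^\star$ (e.g., X.value, x.value from the previous sketch):

```python
import numpy as np

def randomized_round(Xv, xv, A, y, trials=100, seed=0):
    """Rounding heuristic from the slide: sample z ~ N(x*, X* - x* x*^T),
    round to signs, and keep the best candidate found."""
    rng = np.random.default_rng(seed)
    cov = Xv - np.outer(xv, xv)          # PSD up to solver tolerance
    best, best_val = None, np.inf
    for _ in range(trials):
        z = rng.multivariate_normal(xv, cov, method="eigh")
        xh = np.where(z >= 0, 1.0, -1.0)
        val = np.linalg.norm(A @ xh - y) ** 2
        if val < best_val:
            best, best_val = xh, val
    return best, best_val
```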

Interior-point methods

handle linear and nonlinear convex problems (Nesterov & Nemirovsky)
based on Newton's method applied to barrier functions that trap $x$ in the interior of the feasible region (hence the name IP)
worst-case complexity theory: # Newton steps $\sim \sqrt{\text{problem size}}$
in practice: # Newton steps between 5 & 50 (!)
1000s of variables, 10000s of constraints feasible on a PC; far larger if structure is exploited

Log barrier

for the convex problem

minimize $f_0(x)$
subject to $f_i(x) \le 0$, $i = 1, \ldots, m$

we define the logarithmic barrier as

$\phi(x) = -\sum_{i=1}^m \log(-f_i(x))$

$\phi$ is convex, smooth on the interior of the feasible set
$\phi \to \infty$ as $x$ approaches the boundary of the feasible set

Central path

the central path is the curve

$x^\star(t) = \operatorname{argmin}_x \left( t f_0(x) + \phi(x) \right)$, $t \ge 0$

$x^\star(t)$ is strictly feasible, i.e., $f_i(x^\star(t)) < 0$
$x^\star(t)$ can be computed by, e.g., Newton's method
intuition suggests $x^\star(t)$ converges to optimal as $t \to \infty$
using duality, can prove $x^\star(t)$ is $m/t$-suboptimal

Example: central path for an LP

$x^\star(t) = \operatorname{argmin}_x \left( t c^T x - \sum_{i=1}^6 \log(b_i - a_i^T x) \right)$

[figure: polygonal feasible set with the central path from the analytic center $x^\star(0)$ through $x^\star(10)$ toward $x_{\mathrm{opt}}$]

Barrier method (a.k.a. path-following method)

given strictly feasible $x$, $t > 0$, $\mu > 1$
repeat:
1. compute $x := x^\star(t)$ (using Newton's method, starting from $x$)
2. exit if $m/t <$ tol
3. $t := \mu t$

duality gap is reduced by a factor $\mu$ each outer iteration
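A self-contained numpy sketch of the barrier method for an inequality-form LP (minimize $c^T x$ subject to $Ax \le b$), with Newton centering and backtracking line search; the parameter values, the toy box instance, and the stopping tolerances are all assumptions:

```python
import numpy as np

def barrier_lp(c, A, b, x, t=1.0, mu=20.0, tol=1e-8):
    """Barrier method sketch for: minimize c^T x s.t. A x <= b.
    x must be strictly feasible (A x < b); m/t bounds the duality gap."""
    m = len(b)
    f = lambda x, t: t * (c @ x) - np.sum(np.log(b - A @ x))
    while m / t >= tol:
        for _ in range(50):                      # centering: Newton's method
            s = b - A @ x                        # slacks, strictly positive
            grad = t * c + A.T @ (1.0 / s)
            hess = A.T @ np.diag(1.0 / s**2) @ A
            dx = np.linalg.solve(hess, -grad)
            lam2 = -grad @ dx                    # squared Newton decrement
            if lam2 / 2.0 < 1e-10:
                break
            step = 1.0                           # backtracking line search
            while np.min(b - A @ (x + step * dx)) <= 0:
                step *= 0.5                      # stay strictly feasible
            while f(x + step * dx, t) > f(x, t) - 0.25 * step * lam2:
                step *= 0.5
            x = x + step * dx
        t *= mu                                  # move along the central path
    return x

# toy instance: minimize -x1 - x2 over the box [0, 1]^2
c = np.array([-1.0, -1.0])
A = np.vstack([np.eye(2), -np.eye(2)])
b = np.array([1.0, 1.0, 0.0, 0.0])
print(barrier_lp(c, A, b, x=np.array([0.5, 0.5])))   # approx [1, 1]
```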

Trade-off in choice of µ

large µ means:
fast duality gap reduction (fewer outer iterations), but
many Newton steps to compute $x^\star(t^+)$ (more Newton steps per outer iteration)

total effort is measured by the total number of Newton steps

Typical example

GP with $n = 50$ variables, $m = 100$ constraints, $m_i = 5$
wide range of µ works well
very typical behavior (even for large $m$, $n$)

[figure: duality gap (from $10^2$ down to $10^{-6}$) vs. Newton iterations, for µ = 2, 50, 150]

Effect of µ

[figure: total Newton iterations vs. µ, for µ from 0 to 200]

the barrier method works well for µ in a large range

Typical effort versus problem dimensions

LPs with $n = 2m$ variables and $m$ constraints
100 instances for each of 20 problem sizes
average & standard deviation shown

[figure: Newton iterations vs. $m$, on a log scale from $10^1$ to $10^3$]

Other interior-point methods

more sophisticated IP algorithms: primal-dual, incomplete centering, infeasible start
use the same ideas, e.g., central path, log barrier
readily available (commercial and noncommercial packages)
typical performance: 10–50 Newton steps (!) over a wide range of problem dimensions, problem types, and problem data

Complexity analysis of Newton's method

classical result: if $f'''$ is small, Newton's method converges fast
classical analysis is local, and coordinate dependent
need an analysis that is global, and, like Newton's method, coordinate invariant

Self-concordance

self-concordant function $f$ (Nesterov & Nemirovsky, 1988): when restricted to any line,

$|f'''(t)| \le 2 f''(t)^{3/2}$

$f$ SC $\Rightarrow \tilde f(z) = f(Tz)$ SC, for $T$ nonsingular (i.e., SC is coordinate invariant)
a large number of common convex functions are SC:
$x \log x - \log x$, $\log \det X^{-1}$, $-\log(y^2 - x^T x)$, ...

Complexity analysis of Newton's method for self-concordant functions

for a self-concordant function $f$, with minimum value $f^\star$:

theorem: the number of Newton steps to minimize $f$, starting from $x$, satisfies
#steps $\le 11 (f(x) - f^\star) + 5$

empirically: #steps $\approx 0.6 (f(x) - f^\star) + 5$

note the absence of unknown constants, problem dimension, etc.

[figure: #iterations (0 to 25) vs. $f(x^{(0)}) - f^\star$ (0 to 35), scatter plot consistent with the empirical bound above]

Complexity of path-following algorithm

to compute $x^\star(\mu t)$ starting from $x^\star(t)$:

#steps $\le 11 m (\mu - 1 - \log \mu) + 5$

using N&N's self-concordance theory, and duality to bound $f(x^\star(t)) - f^\star$

number of outer steps to reduce the duality gap by a factor $\alpha$: $\log \alpha / \log \mu$

total number of Newton steps bounded by the product

$\frac{\log \alpha}{\log \mu} \left( 11 m (\mu - 1 - \log \mu) + 5 \right)$

... which captures the trade-off in the choice of µ

Complexity analysis conclusions

for any fixed choice of µ, #steps is $O(m \log(1/\epsilon))$, where $\epsilon$ is the final accuracy
to optimize the complexity bound, can take $\mu = 1 + 1/\sqrt{m}$, which yields #steps $O(\sqrt{m} \log(1/\epsilon))$
in any case, IP methods work extremely well in practice

Conclusions

since 1985 or so, lots of advances in the theory & practice of convex optimization:
complexity analysis
semidefinite programming, other new problem classes
efficient interior-point methods & software
lots of applications

Some references

Semidefinite Programming, SIAM Review, 1996
Determinant Maximization with Linear Matrix Inequality Constraints, SIMAX, 1998
Applications of Second-order Cone Programming, LAA, 1999
Linear Matrix Inequalities in System and Control Theory, SIAM, 1994
Interior-point Polynomial Algorithms in Convex Programming, SIAM, 1994, Nesterov & Nemirovsky
Lectures on Modern Convex Optimization, SIAM, 2001, Ben Tal & Nemirovsky

Shameless promotion

Convex Optimization, Boyd & Vandenberghe, to be published in 2003
a pretty good draft is available at the Stanford EE364 (UCLA EE236B) class web site as the course reader