Lecture 7: Convex Optimizations


Radu Balan, David Levermore. March 29, 2018.

Convex Sets. Convex Functions. A set S ⊂ R^n is called a convex set if for any points x, y ∈ S the line segment [x, y] := {t x + (1 - t) y : 0 ≤ t ≤ 1} is included in S, i.e. [x, y] ⊂ S. A function f : S → R is called convex if for any x, y ∈ S and 0 ≤ t ≤ 1, f(t x + (1 - t) y) ≤ t f(x) + (1 - t) f(y). Here S is supposed to be a convex set in R^n. Equivalently, f is convex if its epigraph is a convex set in R^(n+1). Epigraph: {(x, u) : x ∈ S, u ≥ f(x)}. A function f : S → R is called strictly convex if for any x ≠ y ∈ S and 0 < t < 1, f(t x + (1 - t) y) < t f(x) + (1 - t) f(y).

Convex Optimization Problems. The general form of a convex optimization problem:

  min f(x)  over  x ∈ S

where S is a closed convex set and f is a convex function on S. Properties:
1. Any local minimum is a global minimum. The set of minimizers is a convex subset of S.
2. If f is strictly convex, then the minimizer is unique: there is only one local minimizer.
In general S is defined by inequality and equality constraints: S = {x : g_i(x) ≤ 0, 1 ≤ i ≤ p} ∩ {x : h_j(x) = 0, 1 ≤ j ≤ m}. Typically the h_j are required to be affine: h_j(x) = a_j^T x + b_j.

Convex Programs. The hierarchy of convex optimization problems:
1. Linear Programs (LP): linear criterion with linear constraints.
2. Quadratic problems: Quadratic Programs (QP); Quadratically Constrained Quadratic Programs (QCQP); Second-Order Cone Programs (SOCP).
3. Semi-Definite Programs (SDP).
Popular approach/solution: primal-dual interior-point methods using Newton's method (second order).


Linear Programs: LP and standard LPs. Linear Program:

  minimize    c^T x + d
  subject to  G x ≤ h
              A x = b

where G ∈ R^(m×n), A ∈ R^(p×n).

Standard form LP:

  minimize    c^T x
  subject to  A x = b
              x ≥ 0

Inequality form LP:

  minimize    c^T x
  subject to  A x ≤ b
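Side note (my addition, not part of the original slides): an inequality-form LP like the one above can be typed almost verbatim in CVX; the data c, A, b below are random placeholders.

% Hypothetical data; any c, A, b of compatible sizes would do.
n = 10; m = 20;
A = randn(m,n); b = randn(m,1); c = randn(n,1);
cvx_begin
    variable x(n)
    minimize( c' * x )      % linear objective c^T x
    subject to
        A * x <= b;         % componentwise inequalities
cvx_end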

Linear Program Example: Basis Pursuit. Consider a system of linear equations Ax = b with more columns (unknowns) than rows (equations), i.e. A is fat. We want to find the sparsest solution:

  minimize    ‖x‖_0
  subject to  A x = b

where ‖x‖_0 denotes the number of nonzero entries in x (i.e. the support size). This is a non-convex, NP-hard problem. Instead we solve its so-called convexification:

  minimize    ‖x‖_1
  subject to  A x = b

where ‖x‖_1 = ∑_{k=1}^n |x_k|. It can be shown that, under certain conditions (RIP), the solutions of the two problems coincide.
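Aside (my addition, not from the slides): the ℓ1 problem can also be handed to CVX directly, without the LP reformulations shown on the next slides; A and b below are placeholder data.

% Hypothetical underdetermined system: 10 equations, 30 unknowns.
m = 10; n = 30;
A = randn(m,n);
x0 = full(sprandn(n,1,0.1));   % a sparse "ground truth" vector
b = A * x0;
cvx_begin
    variable x(n)
    minimize( norm(x, 1) )     % basis pursuit objective ||x||_1
    subject to
        A * x == b;
cvx_end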


Linear Program Example: Basis Pursuit. How to turn

  minimize    ‖x‖_1
  subject to  A x = b

into an LP? Method 1. Use the auxiliary variables y = (y_k)_{1 ≤ k ≤ n} with |x_k| ≤ y_k, i.e. x_k - y_k ≤ 0 and -x_k - y_k ≤ 0:

  minimize    1^T y
  subject to  A x = b
              x - y ≤ 0
              -x - y ≤ 0
              y ≥ 0


Linear Program Example: Basis Pursuit. How to turn

  minimize    ‖x‖_1
  subject to  A x = b

into an LP? Method 2. Use the substitutions (positive and negative parts) x_k = u_k - v_k, |x_k| = u_k + v_k, with u_k, v_k ≥ 0:

  minimize    1^T u + 1^T v
  subject to  A (u - v) = b
              u ≥ 0
              v ≥ 0
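Sketch (my addition, not in the original notes): Method 2 written out with MATLAB's linprog from the Optimization Toolbox, in the stacked variable z = [u; v]; A and b are assumed given, e.g. from the basis-pursuit setup above.

% minimize 1'u + 1'v  subject to  A(u - v) = b, u >= 0, v >= 0
[m, n] = size(A);
f   = ones(2*n, 1);           % objective vector
Aeq = [A, -A];                % equality constraint A(u - v) = b
beq = b;
lb  = zeros(2*n, 1);          % u >= 0, v >= 0
z   = linprog(f, [], [], Aeq, beq, lb, []);
x   = z(1:n) - z(n+1:end);    % recover x = u - v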


Quadratic Problems. QP: Quadratic Programs.

  minimize    (1/2) x^T P x + q^T x + r
  subject to  G x ≤ h
              A x = b

where P = P^T ⪰ 0, P ∈ R^(n×n), G ∈ R^(m×n), A ∈ R^(p×n).

Example: Constrained Regression (constrained least-squares). The typical LS problem

  min ‖A x - b‖_2^2 = min x^T A^T A x - 2 b^T A x + b^T b

has solution x = A^† b = (A^T A)^(-1) A^T b. Constrained least-squares:

  minimize    ‖A x - b‖_2^2
  subject to  l_i ≤ x_i ≤ u_i,  i = 1, ..., n
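A possible CVX rendering of the constrained least-squares example (my addition; A, b and the bounds l, u are placeholders):

% Hypothetical data: overdetermined system with box constraints on x.
m = 30; n = 10;
A = randn(m,n); b = randn(m,1);
l = -ones(n,1); u = ones(n,1);
cvx_begin
    variable x(n)
    minimize( sum_square(A * x - b) )   % ||Ax - b||_2^2
    subject to
        l <= x;
        x <= u;
cvx_end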


Quadratic Problems. QCQP: Quadratically Constrained Quadratic Programs.

  minimize    (1/2) x^T P x + q^T x + r
  subject to  (1/2) x^T P_i x + q_i^T x + r_i ≤ 0,  i = 1, ..., m
              A x = b

where P = P^T ⪰ 0, P_i = P_i^T ⪰ 0, i = 1, ..., m, P, P_i ∈ R^(n×n), A ∈ R^(p×n).

Remark.
1. QP can be solved as a QCQP: set P_i = 0.
2. The criterion can always be recast in linear form with unknown [x_0; x]:

  minimize    x_0
  subject to  (1/2) x^T P x + q^T x - x_0 + r ≤ 0
              (1/2) x^T P_i x + q_i^T x + r_i ≤ 0,  i = 1, ..., m
              A x = b

Quadratic Problems. SOCP: Second-Order Cone Programs.

  minimize    f^T x
  subject to  ‖A_i x + b_i‖_2 ≤ c_i^T x + d_i,  i = 1, ..., m
              F x = g

where A_i ∈ R^(n_i×n), F ∈ R^(p×n). SOCP is the most general form of a quadratic problem. QCQP: c_i = 0 except for i = 0.

Example: the placement problem. Given a set of weights {w_ij} and of fixed points {x_1, ..., x_L}, find the set of points {x_{L+1}, ..., x_N} that minimize, for p ≥ 1,

  min ∑_{1 ≤ i < j ≤ N} w_ij ‖x_i - x_j‖_p .

Problem: For p = 2 recast it as an SOCP. For p = 1 it is an LP.
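A minimal CVX sketch of the p = 2 placement problem (my addition, with made-up sizes, weights, and fixed points):

% Hypothetical instance: L = 3 fixed points, N - L = 2 free points in the plane.
L = 3; N = 5; dim = 2;
Xfix = randn(dim, L);              % fixed points x_1, ..., x_L
W = rand(N); W = (W + W')/2;       % symmetric nonnegative weights w_ij
cvx_begin
    variable Xfree(dim, N-L)
    X = [Xfix, Xfree];             % all points as columns x_1, ..., x_N
    obj = 0;
    for i = 1:N
        for j = i+1:N
            obj = obj + W(i,j) * norm(X(:,i) - X(:,j), 2);
        end
    end
    minimize( obj )
cvx_end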


Semi-Definite Programs. SDP and standard SDP. Semi-definite program with unknown x ∈ R^n:

  minimize    c^T x
  subject to  x_1 F_1 + ... + x_n F_n + G ⪯ 0
              A x = b

where G, F_1, ..., F_n are k×k symmetric matrices in S^k, A ∈ R^(p×n).

Standard form SDP:

  minimize    trace(C X)
  subject to  trace(A_i X) = b_i,  i = 1, ..., p
              X = X^T ⪰ 0

Inequality form SDP:

  minimize    c^T x
  subject to  x_1 A_1 + ... + x_n A_n ⪯ B

CVX Matlab package. Downloadable from: http://cvxr.com/cvx/. Follows Disciplined Convex Programming à la Boyd [2].

m = 20; n = 10; p = 4;
A = randn(m,n); b = randn(m,1);
C = randn(p,n); d = randn(p,1); e = rand;
cvx_begin
    variable x(n)
    minimize( norm( A * x - b, 2 ) )
    subject to
        C * x == d;
        norm( x, Inf ) <= e;
cvx_end

This solves:

  minimize    ‖A x - b‖_2
  subject to  C x = d
              ‖x‖_∞ ≤ e

CVX SDP Example.

cvx_begin sdp
    variable X(n,n) semidefinite;
    minimize( trace(X) );
    subject to
        abs(trace(E1*X) - d1) <= epsx;
        abs(trace(E2*X) - d2) <= epsx;
cvx_end

This solves:

  minimize    trace(X)
  subject to  |trace(E_1 X) - d_1| ≤ ε
              |trace(E_2 X) - d_2| ≤ ε
              X = X^T ⪰ 0

Dual Problem. Lagrangian. Primal Problem:

  p* = minimize    f_0(x)
       subject to  f_i(x) ≤ 0,  i = 1, ..., m
                   h_i(x) = 0,  i = 1, ..., p

Define the Lagrangian L : R^n × R^m × R^p → R:

  L(x, λ, ν) = f_0(x) + ∑_{i=1}^m λ_i f_i(x) + ∑_{i=1}^p ν_i h_i(x)

The variables λ ∈ R^m and ν ∈ R^p are called dual variables, or Lagrange multipliers.

Dual Problem. Lagrange Dual Function. The Lagrange dual function (or the dual function) is given by:

  g(λ, ν) = inf_{x ∈ D} L(x, λ, ν) = inf_{x ∈ D} ( f_0(x) + ∑_{i=1}^m λ_i f_i(x) + ∑_{i=1}^p ν_i h_i(x) )

where D ⊂ R^n is the common domain of definition of all the functions.

Remark.
1. Key estimate: for any λ ≥ 0 and any ν, g(λ, ν) ≤ p*, because for a primal optimal x*

  g(λ, ν) ≤ L(x*, λ, ν) = f_0(x*) + λ^T f(x*) + ν^T h(x*) ≤ f_0(x*) = p*.

2. g(λ, ν) is a concave function regardless of problem convexity.


Dual Problem. The dual problem. The dual problem is given by:

  d* = maximize    g(λ, ν)
       subject to  λ ≥ 0

Remark.
1. The dual problem is always a convex optimization problem (maximization of a concave function, with convex constraints).
2. We always have weak duality: d* ≤ p*.
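Worked illustration (my addition, following the recipe above): take the standard-form LP from the earlier slide, minimize c^T x subject to A x = b, x ≥ 0. Attaching λ ≥ 0 to the inequalities -x ≤ 0 and ν to A x = b,

  L(x, λ, ν) = c^T x - λ^T x + ν^T (A x - b) = (c - λ + A^T ν)^T x - b^T ν,

so g(λ, ν) = -b^T ν when c - λ + A^T ν = 0 and -∞ otherwise. Eliminating λ = c + A^T ν ≥ 0 gives the dual problem

  maximize    -b^T ν
  subject to  A^T ν + c ≥ 0,

which is again an LP, consistent with Remark 1.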

Dual Problem. The duality gap. Strong duality. The duality gap is p* - d*. If the primal is a convex optimization problem,

  p* = minimize    f_0(x)
       subject to  f_i(x) ≤ 0,  i = 1, ..., m
                   A x = b

and Slater's constraint qualification condition holds (there is a feasible x ∈ relint(D) so that f_i(x) < 0, i = 1, ..., m), then strong duality holds: d* = p*. (Slater's condition is a sufficient, not a necessary, condition.)


The Karush-Kuhn-Tucker (KKT) Conditions. Necessary Conditions. Assume f_0, f_1, ..., f_m, h_1, ..., h_p are differentiable with open domains. Let x* and (λ*, ν*) be any primal and dual optimal points with zero duality gap. It follows that ∇_x L(x, λ*, ν*)|_{x = x*} = 0 and g(λ*, ν*) = L(x*, λ*, ν*) = f_0(x*). We obtain the following set of equations, called the KKT conditions:

  f_i(x*) ≤ 0,   i = 1, ..., m
  h_i(x*) = 0,   i = 1, ..., p
  λ_i* ≥ 0,      i = 1, ..., m
  λ_i* f_i(x*) = 0,   i = 1, ..., m
  ∇f_0(x*) + ∑_{i=1}^m λ_i* ∇f_i(x*) + ∑_{i=1}^p ν_i* ∇h_i(x*) = 0

Remark: the conditions λ_i* f_i(x*) = 0 for each i are called the complementary slackness conditions.
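A tiny worked example (my addition, not from the slides): minimize f_0(x) = x^2 subject to f_1(x) = 1 - x ≤ 0. The KKT conditions read

  2x* - λ* = 0,   1 - x* ≤ 0,   λ* ≥ 0,   λ* (1 - x*) = 0.

If λ* = 0 the stationarity condition gives x* = 0, which is infeasible; hence 1 - x* = 0, so x* = 1 and λ* = 2, recovering the obvious constrained minimizer.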


The Karush-Kuhn-Tucker (KKT) Conditions. Sufficient Conditions. Assume the primal problem is convex with h(x) = Ax - b and f_0, ..., f_m differentiable. Assume x̃ and (λ̃, ν̃) satisfy the KKT conditions:

  ∇f_0(x̃) + ∑_{i=1}^m λ̃_i ∇f_i(x̃) + A^T ν̃ = 0
  f_i(x̃) ≤ 0,   i = 1, ..., m
  A x̃ = b
  λ̃_i ≥ 0,      i = 1, ..., m
  λ̃_i f_i(x̃) = 0,   i = 1, ..., m

Then x̃ and (λ̃, ν̃) are primal and dual optimal with zero duality gap.

A primal-dual interior-point algorithm is an iterative algorithm that approaches a solution of the KKT system of conditions.

Primal-Dual Interior-Point Method. Idea: solve the nonlinear system r_t(x, λ, ν) = 0, where

  r_t(x, λ, ν) = [ ∇f_0(x) + Df(x)^T λ + A^T ν ;  -diag(λ) f(x) - (1/t) 1 ;  A x - b ],
  f(x) = [ f_1(x); ...; f_m(x) ],

and t > 0, using Newton's method (second order). The search direction is obtained by solving the linear system

  [ ∇²f_0(x) + ∑_{i=1}^m λ_i ∇²f_i(x)   Df(x)^T       A^T ] [ Δx ]
  [ -diag(λ) Df(x)                      -diag(f(x))   0   ] [ Δλ ]  =  -r_t(x, λ, ν).
  [ A                                   0             0   ] [ Δν ]

At each iteration t is increased proportionally to the reciprocal of the (surrogate) duality gap -f(x)^T λ.
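A minimal MATLAB sketch of one such Newton step (my own illustration, not from the lecture), specialized to a QP with linear inequalities: f_0(x) = (1/2) x^T P x + q^T x and f(x) = Gx - h, so that Df(x) = G and the Hessians of the f_i vanish.

% One primal-dual Newton step for: minimize (1/2)x'Px + q'x  s.t.  Gx <= h, Ax = b.
function [dx, dlam, dnu] = pd_newton_step(P, q, G, h, A, b, x, lam, nu, t)
    fx = G*x - h;                                  % inequality residuals (kept < 0)
    r_dual = P*x + q + G'*lam + A'*nu;             % gradient of the Lagrangian
    r_cent = -diag(lam)*fx - (1/t)*ones(size(fx)); % centering residual
    r_pri  = A*x - b;                              % equality residual
    [m, n] = size(G); p = size(A,1);
    KKT = [ P,            G',         A';
            -diag(lam)*G, -diag(fx),  zeros(m,p);
            A,            zeros(p,m), zeros(p,p) ];
    sol = -(KKT \ [r_dual; r_cent; r_pri]);        % Newton system: KKT * d = -r_t
    dx = sol(1:n); dlam = sol(n+1:n+m); dnu = sol(n+m+1:end);
end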

References

B. Bollobás, Graph Theory: An Introductory Course, Springer-Verlag, 1979.
S. Boyd, L. Vandenberghe, Convex Optimization, available online at: http://stanford.edu/~boyd/cvxbook/
F. Chung, Spectral Graph Theory, AMS, 1997.
F. Chung, L. Lu, The average distances in random graphs with given expected degrees, Proc. Nat. Acad. Sci. 99(25), 15879-15882 (2002).
F. Chung, L. Lu, V. Vu, The spectra of random graphs with given expected degrees, Internet Math. 1(3), 257-275 (2004).
R. Diestel, Graph Theory, 3rd Edition, Springer-Verlag, 2005.
P. Erdős, A. Rényi, On the Evolution of Random Graphs.
G. Grimmett, Probability on Graphs: Random Processes on Graphs and Lattices, Cambridge University Press, 2010.
C. Hoffman, M. Kahle, E. Paquette, Spectral Gap of Random Graphs and Applications to Random Topology, arXiv:1201.0425 [math.CO], 17 Sept. 2014.
A. Javanmard, A. Montanari, Localization from Incomplete Noisy Distance Measurements, arXiv:1103.1417, Nov. 2012; also ISIT 2011.
J. Leskovec, J. Kleinberg, C. Faloutsos, Graph Evolution: Densification and Shrinking Diameters, ACM Trans. on Knowl. Disc. Data, 1(1), 2007.