
Linear and non-linear programming
Benjamin Recht
March 11, 2005

The Gameplan
Constrained Optimization
Convexity
Duality
Applications/Taxonomy

Constrained Optimization

$$\begin{aligned}
\text{minimize} \quad & f(x) \\
\text{subject to} \quad & g_j(x) \le 0, \quad j = 1, \dots, J \\
& h_k(x) = 0, \quad k = 1, \dots, K \\
& x \in \Omega \subseteq \mathbb{R}^n
\end{aligned}$$

Exercise: formulate the halting problem as an optimization.

Equivalence of Feasibility and Optimization

From a complexity point of view, the optimization
$$\text{minimize } f(x) \quad \text{subject to } g_j(x) \le 0,\ h_k(x) = 0,\ x \in \Omega \subseteq \mathbb{R}^n$$
and the feasibility problem
$$\text{find } x \text{ and } t \quad \text{subject to } f(x) - t \le 0,\ g_j(x) \le 0,\ h_k(x) = 0,\ x \in \Omega \subseteq \mathbb{R}^n$$
are interchangeable. A solution of the optimization answers the feasibility question for every $t$, and bisection on $t$ with a feasibility oracle solves the optimization.
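A minimal sketch of the bisection reduction in Python. The oracle `feasible(t)` is a hypothetical stand-in for whatever method decides the feasibility problem at level t; the names and the toy example are illustrative, not from the lecture.

```python
# Sketch: solve the optimization via bisection on t, given an oracle
# feasible(t) that decides whether
#   {x : f(x) <= t, g_j(x) <= 0, h_k(x) = 0, x in Omega}
# is nonempty.

def minimize_by_bisection(feasible, lo, hi, tol=1e-6):
    """Return a value within tol of the optimum, assuming
    feasible(lo) is False and feasible(hi) is True."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            hi = mid   # a point with f(x) <= mid exists; tighten the upper bound
        else:
            lo = mid   # no such point; the optimal value lies above mid
    return hi

# Toy example: minimize f(x) = (x - 3)^2 with no other constraints,
# so feasible(t) is simply "t >= 0".
print(minimize_by_bisection(lambda t: t >= 0.0, lo=-1.0, hi=10.0))  # ~0
```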

Convexity: Overview

Phrasing a problem as an optimization generally buys you nothing. However, solving a convex program is generically no harder than least squares. The hard part is formulating the problem.

Convex Sets

If $x_1, \dots, x_N \in \Omega$, a convex combination is a linear combination $\sum_{i=1}^N p_i x_i$ where $p_i \ge 0$ and $\sum_{i=1}^N p_i = 1$.

The line segment between $x$ and $y$ is given by $(1-t)x + ty$, $0 \le t \le 1$. This is a convex combination of two points.

A set $\Omega \subseteq \mathbb{R}^n$ is convex if it contains all line segments between all of its points. That is, $x, y \in \Omega$ implies $(1-t)x + ty \in \Omega$ for all $t \in [0, 1]$.

Examples of Convex Sets

$\mathbb{R}^n$ is convex. Any vector space is convex. Any line segment is convex. Any line is convex. The set of psd matrices is convex: $Q \succeq 0$ and $P \succeq 0$ imply $tQ + (1-t)P \succeq 0$ for $t \in [0, 1]$.

Examples of Non-convex Sets

The integers are not convex. The set of bit strings of length $n$ is not convex. The set of vectors with norm 1 is not convex. The set of singular matrices is not convex. The set of invertible matrices is not convex.

Operations that preserve convexity

If $\Omega_1, \dots, \Omega_m$ are convex, then $\bigcap_{i=1}^m \Omega_i$ is convex.
If $\Omega_1$ is convex, then $\Omega_2 = \{Ax + b \mid x \in \Omega_1\}$ is convex.
If $\Omega_1$ is convex, then $\Omega_2 = \{x \mid Ax + b \in \Omega_1\}$ is convex.

Convex Functions

A function $f : \Omega \to \mathbb{R}$ with $\Omega \subseteq \mathbb{R}^n$ is convex if its epigraph
$$\mathrm{epi}(f) = \{(x, t) \mid x \in \Omega,\ t \ge f(x)\}$$
is a convex set. Equivalently, $f$ is convex iff for all $x, y$ and $t \in [0, 1]$,
$$f((1-t)x + ty) \le (1-t)f(x) + t f(y).$$

Checking Convexity with Derivatives

Let $f : \mathbb{R}^n \to \mathbb{R}$.

If $f$ is differentiable, $f$ is convex iff $f(y) \ge f(x) + \nabla f(x)^\top (y - x)$ for all $x, y$.

If $f$ is twice differentiable, $f$ is convex iff $\nabla^2 f(x)$ is positive semidefinite for all $x$.

These facts will be useful next week when we discuss optimization algorithms.
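A quick numerical illustration of the second-order test, assuming NumPy is available (not something the lecture prescribes): for a quadratic $f(x) = \tfrac{1}{2} x^\top Q x$ the Hessian is $Q$ everywhere, so convexity reduces to checking the eigenvalues of $Q$.

```python
import numpy as np

# Second-order test: f(x) = x'Qx/2 has Hessian Q, so f is convex
# exactly when Q is positive semidefinite (all eigenvalues >= 0).
Q = np.array([[2.0, 0.5],
              [0.5, 1.0]])
eigvals = np.linalg.eigvalsh(Q)   # eigenvalues of the symmetric Hessian
print(eigvals, "convex" if eigvals.min() >= 0 else "not convex")
```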

Operations that preserve convexity

If $f(x)$ is convex, then $f(Ax + b)$ is convex.
If $f_1, \dots, f_n$ are convex, then so is $a_1 f_1 + \cdots + a_n f_n$ for any nonnegative scalars $a_i$.
If $f_1, \dots, f_n$ are convex, then $\max_i f_i(x)$ is convex.
If $f(x, y)$ is convex in $x$ for every $y$, then $\sup_y f(x, y)$ is convex.

Examples of Convex Functions

Any affine function $f(x) = Ax + b$ is convex. $-\log(x)$ is convex. $\exp(x)$ is convex. $x^2$ is convex. A quadratic form $x^\top Q x$ with $Q = Q^\top$ is convex if and only if $Q \succeq 0$.

Quadratic Forms

A quadratic form $f(x) = x^\top Q x$ with $Q = Q^\top$ is convex if and only if $Q \succeq 0$.

Proof: $Q \succeq 0$ implies $Q = A^\top A$ for some $A$. Then $x^\top Q x = x^\top A^\top A x = \|Ax\|^2$, which is convex. Conversely, if $Q$ is not psd, let $v$ be a norm-1 eigenvector corresponding to an eigenvalue $\lambda < 0$. If $f$ were convex, then
$$0 = f\big(\tfrac{1}{2}(-v) + \tfrac{1}{2}v\big) \le \tfrac{1}{2}\big[(-v)^\top Q (-v) + v^\top Q v\big] = \lambda < 0,$$
a contradiction.

Examples of Non-Convex Functions

$\sin(x)$, $\cos(x)$, and $\tan(x)$ are not convex. $x^3$ is not convex. Gaussians $p(x) = \exp(-x^\top \Lambda^{-1} x / 2)$ are not convex. However, $\log p(x)$ is concave (equivalently, $-\log p(x)$ is convex)!

Examples of Convex Constraint Sets

$\Omega_1 = \{x \mid g(x) \le 0\}$ is convex if $g$ is convex.
Proof: Let $x, y \in \Omega_1$ and $0 \le t \le 1$. Since $g$ is convex,
$$g(tx + (1-t)y) \le t\,g(x) + (1-t)\,g(y) \le 0,$$
proving $tx + (1-t)y \in \Omega_1$.

$\Omega_2 = \{x \mid h(x) = 0\}$ is convex if $h(x) = Ax + b$.
Proof: Let $x, y \in \Omega_2$ and $0 \le t \le 1$. Since $h$ is affine,
$$h(tx + (1-t)y) = A(tx + (1-t)y) + b = t(Ax + b) + (1-t)(Ay + b) = 0,$$
proving $tx + (1-t)y \in \Omega_2$.

The Hahn-Banach Theorem

A hyperplane is a set of the form $\{x \mid a^\top x = b\} \subset \mathbb{R}^n$. A half-space is a set of the form $\{x \mid a^\top x \le b\} \subset \mathbb{R}^n$.

Theorem: If $\Omega$ is convex and $x \notin \mathrm{cl}(\Omega)$, then there exists a hyperplane separating $x$ and $\Omega$. It follows that a closed convex set $\Omega$ is the intersection of all half-spaces which contain it.

Duality

$$\begin{aligned}
\text{minimize} \quad & f(x) \\
\text{subject to} \quad & g_j(x) \le 0, \quad x \in \Omega
\end{aligned}$$

The Lagrangian for this problem is given by
$$L(x, \mu) = f(x) + \sum_{j=1}^J \mu_j g_j(x), \qquad \mu \ge 0.$$
The $\mu_j$ are called Lagrange multipliers. In calculus, we searched for values of $\mu$ by solving $\nabla_x L(x, \mu) = 0$. Here, note that solving the optimization is equivalent to solving
$$\min_x \max_{\mu \ge 0} L(x, \mu):$$
the inner maximization returns $f(x)$ when $x$ is feasible and $+\infty$ otherwise.

Duality (2)

$$\min_x \max_{\mu \ge 0} L(x, \mu) \;\ge\; \max_{\mu \ge 0} \min_x L(x, \mu)$$

The right-hand side is called the dual program.

Proof: Let $f(x, y)$ be any function of two arguments. Then $f(x, y) \ge \min_x f(x, y)$. Taking the max with respect to $y$ of both sides shows $\max_y f(x, y) \ge \max_y \min_x f(x, y)$. Since the right-hand side no longer depends on $x$, taking the min over $x$ on the left proves the theorem.

Duality (3)

The dual program is always concave. To see this, consider the dual function
$$q(\mu) \equiv \min_x L(x, \mu) = \min_x \Big[ f(x) + \sum_{j=1}^J \mu_j g_j(x) \Big].$$
Now, since $\min_x (f(x) + g(x)) \ge \min_x f(x) + \min_x g(x)$, we have
$$q(t\mu^1 + (1-t)\mu^2) = \min_x \Big[ t\Big(f(x) + \sum_{j=1}^J \mu^1_j g_j(x)\Big) + (1-t)\Big(f(x) + \sum_{j=1}^J \mu^2_j g_j(x)\Big) \Big] \ge t\,q(\mu^1) + (1-t)\,q(\mu^2).$$

Duality (4)

The dual may be interpreted as searching over half-spaces which contain the set $\{(f(x), g(x)) \in \mathbb{R}^{J+1} \mid x \in \Omega\}$. This is illustrated in the figures. When the problem is convex and strictly feasible, the dual of the dual returns the primal.

Duality Gaps

We know that the solution to the primal problem is greater than or equal to the solution of the dual problem. The duality gap is defined to be
$$\min_{x \in \Omega} \max_{\mu \ge 0} L(x, \mu) \;-\; \max_{\mu \ge 0} \min_{x \in \Omega} L(x, \mu).$$
When $f$ and the $g_j$ are convex functions, $\Omega$ is a convex set, and there is a point strictly inside $\Omega$ with $g_j(x) < 0$ for all $j$, then the duality gap is zero. Otherwise, estimating the duality gap is quite hard; in many cases, this gap is infinite. Later classes will examine how to analyze when the gap is small.

Linear Programming

$$\begin{aligned}
\text{minimize} \quad & c^\top x \\
\text{subject to} \quad & Ax \ge b \\
& x \ge 0
\end{aligned}$$

Sometimes you will have equality constraints as well. Sometimes you won't have $x \ge 0$.

Equivalence of Representations

To turn unsigned variables into nonnegative variables: $x = x^+ - x^-$ with $x^\pm \ge 0$.
To turn equality constraints into inequalities: $Ax = b \iff Ax \le b$ and $Ax \ge b$.
To turn inequalities into equalities: $Ax \le b \iff Ax + s = b$ with $s \ge 0$. Such $s$ are called slack variables.

Linear Programming Duality

Set up the Lagrangian:
$$L(x, \mu) = c^\top x + \mu^\top (b - Ax) = (c - A^\top \mu)^\top x + b^\top \mu.$$
Minimize with respect to $x \ge 0$:
$$\inf_{x \ge 0} L(x, \mu) = \begin{cases} b^\top \mu & c - A^\top \mu \ge 0 \\ -\infty & \text{otherwise.} \end{cases}$$

Linear Programming Duality

The dual program:
$$\begin{aligned}
\text{maximize} \quad & b^\top \mu \\
\text{subject to} \quad & A^\top \mu \le c \\
& \mu \ge 0
\end{aligned}$$

The dual of a linear program is a linear program. It has the same number of variables as the primal has constraints, and the same number of constraints as the primal has variables.
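To see this duality concretely, here is a small sketch using SciPy's `linprog` (an assumption; the lecture does not prescribe a solver, and any LP solver would do). It solves the primal and the dual above on made-up data and checks that the optimal values agree.

```python
import numpy as np
from scipy.optimize import linprog  # assumed available (SciPy)

# Primal: min c'x  s.t. Ax >= b, x >= 0   (linprog wants A_ub x <= b_ub)
# Dual:   max b'mu s.t. A'mu <= c, mu >= 0
A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([2.0, 3.0])

primal = linprog(c, A_ub=-A, b_ub=-b)   # bounds default to x >= 0
dual   = linprog(-b, A_ub=A.T, b_ub=c)  # maximize b'mu == minimize -b'mu
print(primal.fun, -dual.fun)            # equal: zero duality gap for this LP
```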

Basic Feasible Solutions

Consider the LP
$$\min\ c^\top x \quad \text{s.t.}\ Ax = b,\ x \ge 0,$$
where $A$ is $m \times n$ and has $m$ linearly independent columns. Let $B$ be an $m \times m$ matrix formed by picking $m$ linearly independent columns from $A$. A basic solution of the LP is given by
$$x_j = \begin{cases} [B^{-1} b]_k & a_j \text{ is the } k\text{th column of } B \\ 0 & a_j \notin B. \end{cases}$$
If $x$ is feasible, it is called a basic feasible solution (BFS).

The Simplex Algorithm

FACT: If an optimal solution to an LP exists, then an optimal BFS exists.

Simplex Algorithm (sketch):
1. Find a BFS.
2. Find a column which improves the cost; if none exists, stop.
3. Swap this column in and find a new BFS.
4. Go to step 2.
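The FACT suggests a brute-force alternative to simplex for tiny problems: enumerate every basis, keep the feasible basic solutions, and take the best. A sketch assuming NumPy, with made-up data (two inequality constraints converted to equalities via slack columns):

```python
import itertools
import numpy as np

# If an optimum exists, it is attained at a BFS, so for small problems
# we can enumerate all m-column bases of A and compare.
A = np.array([[1.0, 2.0, 1.0, 0.0],   # columns 3, 4 are slack variables
              [3.0, 1.0, 0.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([-2.0, -3.0, 0.0, 0.0])
m, n = A.shape

best = None
for cols in itertools.combinations(range(n), m):
    B = A[:, cols]
    if abs(np.linalg.det(B)) < 1e-12:
        continue                      # columns not linearly independent
    xB = np.linalg.solve(B, b)        # basic solution for this basis
    if (xB >= -1e-12).all():          # feasible, i.e. a BFS
        x = np.zeros(n)
        x[list(cols)] = xB
        if best is None or c @ x < c @ best:
            best = x
print(best, c @ best)
```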

Chebyshev approximation

$$\min_x \max_{i=1,\dots,N} |a_i^\top x - b_i|$$

is equivalent to the LP

$$\begin{aligned}
\min\ & t \\
\text{s.t.}\ & a_i^\top x - b_i \le t, \quad i = 1, \dots, N \\
& -a_i^\top x + b_i \le t, \quad i = 1, \dots, N.
\end{aligned}$$
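A sketch of this LP with SciPy's `linprog` (assumed available), stacking the decision vector as $(x, t)$; the random data is only for illustration.

```python
import numpy as np
from scipy.optimize import linprog  # assumed available (SciPy)

rng = np.random.default_rng(0)
A_data, b_data = rng.standard_normal((20, 3)), rng.standard_normal(20)
N, n = A_data.shape

# Variables are (x, t); minimize t subject to -t <= a_i'x - b_i <= t.
c = np.r_[np.zeros(n), 1.0]
A_ub = np.block([[ A_data, -np.ones((N, 1))],
                 [-A_data, -np.ones((N, 1))]])
b_ub = np.r_[b_data, -b_data]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + 1))
x_cheb, t = res.x[:n], res.x[-1]
print(t, np.abs(A_data @ x_cheb - b_data).max())  # the two agree
```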

$L_1$ approximation

$$\min_x \sum_{i=1}^N |a_i^\top x - b_i|$$

is equivalent to the LP

$$\begin{aligned}
\min\ & \textstyle\sum_{i=1}^N t_i \\
\text{s.t.}\ & a_i^\top x - b_i \le t_i, \quad i = 1, \dots, N \\
& -a_i^\top x + b_i \le t_i, \quad i = 1, \dots, N.
\end{aligned}$$
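The same recipe applies, now with one $t_i$ per residual rather than a single $t$; again a sketch assuming SciPy.

```python
import numpy as np
from scipy.optimize import linprog  # assumed available (SciPy)

rng = np.random.default_rng(0)
A_data, b_data = rng.standard_normal((20, 3)), rng.standard_normal(20)
N, n = A_data.shape

# Variables are (x, t_1..t_N); minimize sum(t) with -t_i <= a_i'x - b_i <= t_i.
c = np.r_[np.zeros(n), np.ones(N)]
A_ub = np.block([[ A_data, -np.eye(N)],
                 [-A_data, -np.eye(N)]])
b_ub = np.r_[b_data, -b_data]
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + N))
x_l1 = res.x[:n]
print(res.fun, np.abs(A_data @ x_l1 - b_data).sum())  # the two agree
```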

Probability

The set of probability distributions forms a convex set. For example, the set of probabilities for $N$ events is
$$\Big\{ p \;\Big|\; \sum_{i=1}^N p_i = 1,\ p_i \ge 0 \Big\}.$$
The entropy
$$H[p] \equiv -\sum_{i=1}^N p_i \log p_i$$
is a concave function of a probability distribution.

Maximum Entropy Distributions

Let $f$ be some random variable taking value $f_i$ on event $i$. Then the problem
$$\begin{aligned}
\max_p\ & H[p] \\
\text{s.t.}\ & \mathbb{E}_p[f] = \bar{f}
\end{aligned}$$
is a convex program. Its solution is the maximum entropy distribution with the desired expected value. Using the Lagrangian, one can show
$$p_i \propto \exp(\lambda f_i),$$
and the dual is
$$\min_\lambda\ \log \sum_{i=1}^N \exp(\lambda f_i) - \lambda \bar{f}.$$
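A sketch of the dual route, assuming SciPy: minimize the one-dimensional dual, then recover $p_i \propto \exp(\lambda f_i)$ and check the moment constraint. The values `f_vals` and `f_bar` are made up.

```python
import numpy as np
from scipy.optimize import minimize_scalar  # assumed available (SciPy)

f_vals = np.array([1.0, 2.0, 3.0, 4.0])  # values f_i of the random variable
f_bar = 3.2                               # desired expectation E_p[f]

# Dual of the max-entropy problem: log(sum_i exp(lam*f_i)) - lam*f_bar.
dual = lambda lam: np.log(np.exp(lam * f_vals).sum()) - lam * f_bar
lam = minimize_scalar(dual).x

p = np.exp(lam * f_vals)
p /= p.sum()                              # primal solution p_i ∝ exp(lam*f_i)
print(p, p @ f_vals)                      # expectation matches f_bar
```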

Semidefinite Programming

If $A$ and $B$ are symmetric $n \times n$ matrices, then
$$\mathrm{Tr}(AB) = \sum_{i,j=1}^n A_{ij} B_{ij},$$
providing an inner product on matrices. A semidefinite program is a linear program over the positive semidefinite matrices:
$$\begin{aligned}
\text{minimize} \quad & \mathrm{Tr}(A_0 Z) \\
\text{subject to} \quad & \mathrm{Tr}(A_k Z) = c_k, \quad k = 1, \dots, K \\
& Z \succeq 0.
\end{aligned}$$
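A small instance in exactly this form, written with cvxpy (an assumption; the lecture does not prescribe a tool). The data matrices are chosen so the problem is feasible and bounded below.

```python
import numpy as np
import cvxpy as cp  # assumed available; any SDP-capable modeler would do

# min Tr(A0 Z) s.t. Tr(Ak Z) = c_k, Z >= 0 (positive semidefinite).
A0 = np.diag([3.0, 2.0, 1.0])             # PSD objective => bounded below
A1 = np.eye(3)                            # Tr(A1 Z) = Tr(Z)
A2 = np.array([[0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0],
               [0.0, 0.0, 0.0]])          # Tr(A2 Z) = 2 Z_12
c1, c2 = 1.0, 0.5

Z = cp.Variable((3, 3), PSD=True)         # Z ranges over PSD matrices
prob = cp.Problem(cp.Minimize(cp.trace(A0 @ Z)),
                  [cp.trace(A1 @ Z) == c1, cp.trace(A2 @ Z) == c2])
prob.solve()
# Optimal value; the smallest eigenvalue of Z is ~0 up to solver tolerance.
print(prob.value, np.linalg.eigvalsh(Z.value).min())
```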

Semidefinite Programming Duality

Set up the Lagrangian:
$$L(Z, \mu) = \mathrm{Tr}(A_0 Z) + \sum_{k=1}^K \mu_k \big(\mathrm{Tr}(A_k Z) - c_k\big) = \mathrm{Tr}\Big( \Big( A_0 + \sum_{k=1}^K \mu_k A_k \Big) Z \Big) - c^\top \mu.$$
Minimize with respect to $Z \succeq 0$:
$$\inf_{Z \succeq 0} L(Z, \mu) = \begin{cases} -c^\top \mu & A_0 + \sum_{k=1}^K \mu_k A_k \succeq 0 \\ -\infty & \text{otherwise.} \end{cases}$$

Semidefinite Programming Duality

The dual program:
$$\max\ -c^\top \mu \quad \text{s.t.}\ A_0 + \sum_{k=1}^K \mu_k A_k \succeq 0,$$
i.e., the same problem as minimizing $c^\top \mu$ over this LMI, up to the sign of the optimal value. We can put this back into the standard form by noting that the constraint set, without the positivity condition, is an affine set and hence can be written as an intersection of hyperplanes
$$C = \{W \mid \mathrm{Tr}(W G_i) = b_i,\ i = 1, \dots, T\}$$
for some symmetric matrices $G_i$ and scalars $b_i$. But it is important to recognize both forms as semidefinite programs.

Linear Programming as SDPs

$$\min\ c^\top x \quad \text{s.t.}\ Ax \ge b,\ x \ge 0$$

Let $a_i^\top$ denote the $i$th row of $A$. The LP is equivalent to the SDP
$$\min\ c^\top x \quad \text{s.t.}\ \mathrm{diag}(a_i^\top x - b_i) \succeq 0$$
(the sign constraints $x \ge 0$ can be appended to the diagonal in the same way).
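The translation is mechanical in a modeler like cvxpy (assumed, as before): a diagonal matrix is PSD exactly when its diagonal entries are nonnegative, so the LP constraints become diagonal LMIs. A sketch, reusing the LP data from the duality example:

```python
import numpy as np
import cvxpy as cp  # assumed available (cvxpy)

# The LP min c'x s.t. Ax >= b, x >= 0, written as an SDP: a diagonal
# matrix is PSD iff its diagonal entries are nonnegative.
A = np.array([[1.0, 2.0], [3.0, 1.0]])
b = np.array([4.0, 6.0])
c = np.array([2.0, 3.0])

x = cp.Variable(2)
prob = cp.Problem(cp.Minimize(c @ x),
                  [cp.diag(A @ x - b) >> 0, cp.diag(x) >> 0])
prob.solve()
print(prob.value)  # matches the value of the LP solved directly
```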

Quadratic Programs as SDPs

A quadratically constrained convex quadratic program is the optimization
$$\min\ f_0(x) \quad \text{s.t.}\ f_i(x) \le 0, \qquad f_i(x) = x^\top A_i x + 2 b_i^\top x + c_i,\ \ A_i \succeq 0.$$
Let $Q_i^\top Q_i = A_i$. This is equivalent to the semidefinite program
$$\begin{aligned}
\min\ & t \\
\text{s.t.}\ & \begin{bmatrix} I & Q_0 x \\ x^\top Q_0^\top & t - 2 b_0^\top x - c_0 \end{bmatrix} \succeq 0 \\
& \begin{bmatrix} I & Q_i x \\ x^\top Q_i^\top & -2 b_i^\top x - c_i \end{bmatrix} \succeq 0.
\end{aligned}$$
By the Schur complement (since $I \succ 0$), the first LMI holds iff $t - 2b_0^\top x - c_0 \ge x^\top Q_0^\top Q_0 x = x^\top A_0 x$, i.e., $f_0(x) \le t$; the others encode $f_i(x) \le 0$ in the same way.

Logarithmic Chebyshev Approximation

$$\min_x \max_{i=1,\dots,N} |\log(a_i^\top x) - \log(b_i)|$$

is equivalent to the SDP

$$\begin{aligned}
\min\ & t \\
\text{s.t.}\ & \begin{bmatrix} t - a_i^\top x / b_i & 0 & 0 \\ 0 & a_i^\top x / b_i & 1 \\ 0 & 1 & t \end{bmatrix} \succeq 0, \quad i = 1, \dots, N.
\end{aligned}$$

Finding the maximum singular value

Let $A(x) = A_0 + A_1 x_1 + \cdots + A_k x_k$ be an $n \times m$ matrix-valued function. Which value of $x$ minimizes the maximum singular value of $A(x)$? Solve with an SDP:
$$\begin{aligned}
\min\ & t \\
\text{s.t.}\ & \begin{bmatrix} t I & A(x) \\ A(x)^\top & t I \end{bmatrix} \succeq 0.
\end{aligned}$$
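A sketch with cvxpy (assumed): its `sigma_max` atom encodes this same semidefinite characterization of the largest singular value, so minimizing it solves the problem above. The matrices $A_0, A_1, A_2$ are made up.

```python
import numpy as np
import cvxpy as cp  # assumed available (cvxpy)

# Minimize the largest singular value of A(x) = A0 + x1*A1 + x2*A2.
A0 = np.array([[1.0, 2.0], [0.0, 1.0]])
A1 = np.array([[0.0, -1.0], [1.0, 0.0]])
A2 = np.array([[1.0, 0.0], [0.0, -1.0]])

x = cp.Variable(2)
Ax = A0 + x[0] * A1 + x[1] * A2
prob = cp.Problem(cp.Minimize(cp.sigma_max(Ax)))
prob.solve()
A_opt = A0 + x.value[0] * A1 + x.value[1] * A2
print(prob.value, np.linalg.svd(A_opt, compute_uv=False).max())  # agree
```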

Problem 1: Examples of Convex Functions (Bertsekas Ex 1.5)

Show that the following are convex on $\mathbb{R}^n$:
$f_1(x) = -(x_1 x_2 \cdots x_n)^{1/n}$ on $\{x \in \mathbb{R}^n \mid x_i > 0\}$
$f_2(x) = \log \sum_{i=1}^n \exp(x_i)$
$f_3(x) = \|x\|^p$ with $p \ge 1$
$f_4(x) = 1 / f(x)$, where $f$ is concave and positive for all $x$.

Problem 2: Zero-Sum Games (Bertsekas Ex 6.6)

Let $A$ be an $n \times m$ matrix. Consider the zero-sum game where player 1 picks a row of $A$ and player 2 picks a column of $A$. Player 1 has the goal of picking as small an element as possible, and player 2 has the goal of picking as large an element as possible. This problem will use duality to prove that the optimal strategy is independent of who goes first. That is,
$$\max_{z \in Z} \min_{x \in X} x^\top A z = \min_{x \in X} \max_{z \in Z} x^\top A z,$$
where $X = \{x \mid \sum_i x_i = 1,\ x_i \ge 0\} \subset \mathbb{R}^n$ and $Z = \{z \mid \sum_i z_i = 1,\ z_i \ge 0\} \subset \mathbb{R}^m$.

For a fixed $z$, show $\min_{x \in X} x^\top A z = \min\{[Az]_1, \dots, [Az]_n\}$, and hence
$$\max_{z \in Z} \min_{x \in X} x^\top A z = \max\ \{\, t \mid z \in Z,\ [Az]_i \ge t \text{ for all } i \,\}.$$
In a similar fashion, show
$$\min_{x \in X} \max_{z \in Z} x^\top A z = \min\ \{\, u \mid x \in X,\ [A^\top x]_i \le u \text{ for all } i \,\}.$$
Finally, show that these two linear programs are dual to each other.

Problem 3: Duality Gaps

Consider the non-convex quadratic program
$$\min\ x^2 + y^2 + z^2 + 2xy + 2yz + 2zx \quad \text{s.t.}\ x^2 = y^2 = z^2 = 1.$$
Show that the dual problem is a semidefinite program (hint: write the program in matrix form in terms of quadratic forms). Show that the dual optimum is zero. By trying cases, show that the minimum of the primal is equal to one.