Lagrangian Duality and Convex Optimization


Lagrangian Duality and Convex Optimization
David Rosenberg (New York University)
DS-GA 1003, February 11, 2015

Introduction: Why Convex Optimization?

Historically, linear programs (linear objectives and constraints) were the focus. Today we work with nonlinear programs: some easy, some hard. The main distinction is between convex and non-convex problems. Convex problems are the ones we know how to solve efficiently. Many techniques that are well understood for convex problems are also applied to non-convex problems; e.g., SGD is routinely applied to neural networks.

Introduction: Your Reference for Convex Optimization

Boyd and Vandenberghe (2004) is very clearly written, but has a ton of detail for a first pass. See my Extreme Abridgment of Boyd and Vandenberghe.

Introduction: Notation from Boyd and Vandenberghe

We write f : R^p -> R^q to mean that f maps from some subset of R^p into R^q, namely from dom f ⊆ R^p, where dom f is the domain of f.

Convex Sets and Functions: Convex Sets

Definition. A set C is convex if for any x_1, x_2 ∈ C and any θ with 0 ≤ θ ≤ 1 we have

    θ x_1 + (1 − θ) x_2 ∈ C.

[Figure: KPM Fig. 7.4]

Convex Sets and Functions: Convex and Concave Functions

Definition. A function f : R^n -> R is convex if dom f is a convex set and if for all x, y ∈ dom f and 0 ≤ θ ≤ 1 we have

    f(θx + (1 − θ)y) ≤ θ f(x) + (1 − θ) f(y).

[Figure: KPM Fig. 7.5]

Convex Sets and Functions: Examples of Convex Functions on R

- x ↦ ax + b is both convex and concave on R for all a, b ∈ R.
- x ↦ |x|^p for p ≥ 1 is convex on R.
- x ↦ e^{ax} is convex on R for all a ∈ R.

Convex Sets and Functions: Maximum of Convex Functions is Convex

Theorem. If f_1, ..., f_m : R^n -> R are convex, then their pointwise maximum

    f(x) = max{f_1(x), ..., f_m(x)}

is also convex, with domain dom f = dom f_1 ∩ ... ∩ dom f_m. This result extends to the sup over arbitrary (even infinite) sets of functions.

Proof (for m = 2). Fix θ with 0 ≤ θ ≤ 1 and x, y ∈ dom f. Then

    f(θx + (1 − θ)y) = max{f_1(θx + (1 − θ)y), f_2(θx + (1 − θ)y)}
                     ≤ max{θ f_1(x) + (1 − θ) f_1(y), θ f_2(x) + (1 − θ) f_2(y)}
                     ≤ max{θ f_1(x), θ f_2(x)} + max{(1 − θ) f_1(y), (1 − θ) f_2(y)}
                     = θ f(x) + (1 − θ) f(y).
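The definition and this theorem can be spot-checked numerically. The sketch below (the helper name convexity_holds and the sampled interval are my own choices, and random sampling gives evidence, not a proof) tests the convexity inequality for the example functions and for their pointwise maximum:

```python
import math
import random

def convexity_holds(f, lo=-2.0, hi=2.0, trials=2000, tol=1e-9):
    """Spot-check f(tx + (1-t)y) <= t f(x) + (1-t) f(y) at random points.

    Passing is evidence of convexity on [lo, hi], not a proof."""
    rng = random.Random(0)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        t = rng.random()
        if f(t * x + (1 - t) * y) > t * f(x) + (1 - t) * f(y) + tol:
            return False
    return True

affine = lambda x: 3.0 * x + 1.0       # ax + b: convex (and concave)
power  = lambda x: abs(x) ** 1.5       # |x|^p with p >= 1: convex
expo   = lambda x: math.exp(-0.7 * x)  # e^{ax}: convex for any a

assert convexity_holds(affine)
assert convexity_holds(power)
assert convexity_holds(expo)
# pointwise maximum of convex functions is convex
assert convexity_holds(lambda x: max(power(x), expo(x)))
# sanity check: a concave function fails
assert not convexity_holds(lambda x: -x * x)
```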

Convex Sets and Functions: Convex Functions and Optimization

Definition. A function f is strictly convex if the line segment connecting any two points on the graph of f lies strictly above the graph (excluding the endpoints).

Consequences for optimization:
- convex: if there is a local minimum, then it is a global minimum
- strictly convex: if there is a local minimum, then it is the unique global minimum

The General Optimization Problem: Standard Form

    minimize    f_0(x)
    subject to  f_i(x) ≤ 0,  i = 1, ..., m
                h_i(x) = 0,  i = 1, ..., p

where x ∈ R^n is the optimization variable and f_0 is the objective function. Assume the domain D = ∩_{i=0}^m dom f_i ∩ ∩_{i=1}^p dom h_i is nonempty.

The General Optimization Problem General Optimization Problem: More Terminology The set of points satisfying the constraints is called the feasible set. A point x in the feasible set is called a feasible point. If x is feasible and f i (x) = 0, then we say the inequality constraint f i (x) 0 is active at x. The optimal value p of the problem is defined as p = inf {f 0 (x) f i (x) 0,i = 1,...,m, h i (x) = 0, i = 1,...,p}. x is an optimal point (or a solution to the problem) if x is feasible and f (x ) = p. avid Rosenberg (New York University) DS-GA 1003 February 11, 2015 11 / 24

Lagrangian Duality (Convexity not required): The Lagrangian

Recall the general optimization problem:

    minimize    f_0(x)
    subject to  f_i(x) ≤ 0,  i = 1, ..., m
                h_i(x) = 0,  i = 1, ..., p.

Definition. The Lagrangian for the general optimization problem is

    L(x, λ, ν) = f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{i=1}^p ν_i h_i(x).

The λ_i's and ν_i's are called Lagrange multipliers; λ and ν are also called the dual variables.

Lagrangian Duality (Convexity not required): The Lagrangian Encodes the Objective and Constraints

The supremum of the Lagrangian over the dual variables gives back the objective and constraints:

    sup_{λ ⪰ 0, ν} L(x, λ, ν) = sup_{λ ⪰ 0, ν} ( f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{i=1}^p ν_i h_i(x) )
                              = f_0(x)  if f_i(x) ≤ 0 and h_i(x) = 0 for all i,
                              = ∞       otherwise.

Equivalent primal form of the optimization problem:

    p* = inf_x sup_{λ ⪰ 0, ν} L(x, λ, ν).
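This encoding is easy to see on a small example. The toy problem below (minimize x^2 subject to x ≥ 1, i.e. f_1(x) = 1 − x ≤ 0) is my own illustration, not from the slides; the sketch checks that the sup over λ ⪰ 0 returns the objective at feasible points and blows up at infeasible ones:

```python
def f0(x): return x * x
def f1(x): return 1.0 - x  # inequality constraint f1(x) <= 0, i.e. x >= 1

def lagrangian(x, lam):
    return f0(x) + lam * f1(x)

def sup_over_lam(x, lams):
    # crude stand-in for sup_{lam >= 0}: maximize over a finite list
    return max(lagrangian(x, lam) for lam in lams)

lams = [0.0, 1.0, 10.0, 1000.0]

# feasible x (f1(x) <= 0): the sup is attained at lam = 0 and equals f0(x)
assert sup_over_lam(2.0, lams) == f0(2.0)
# infeasible x (f1(x) > 0): the sup grows without bound as lam grows
assert sup_over_lam(0.0, lams) > 999.0
```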

Lagrangian Duality (Convexity not required): The Primal and the Dual

Original optimization problem in primal form:

    p* = inf_x sup_{λ ⪰ 0, ν} L(x, λ, ν)

The Lagrangian dual problem:

    d* = sup_{λ ⪰ 0, ν} inf_x L(x, λ, ν)

We will show weak duality: p* ≥ d* for any optimization problem.

Lagrangian Duality (Convexity not required): Weak Max-Min Inequality

Theorem. For any f : R^n × R^m -> R, W ⊆ R^n, and Z ⊆ R^m, we have

    sup_{z ∈ Z} inf_{w ∈ W} f(w, z) ≤ inf_{w ∈ W} sup_{z ∈ Z} f(w, z).

Proof. For any w_0 ∈ W and z_0 ∈ Z, we clearly have

    inf_{w ∈ W} f(w, z_0) ≤ f(w_0, z_0) ≤ sup_{z ∈ Z} f(w_0, z).

Since inf_{w ∈ W} f(w, z_0) ≤ sup_{z ∈ Z} f(w_0, z) holds for all w_0 and z_0, we must also have

    sup_{z_0 ∈ Z} inf_{w ∈ W} f(w, z_0) ≤ inf_{w_0 ∈ W} sup_{z ∈ Z} f(w_0, z).
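On finite sets the inequality is a one-liner. The payoff matrix below is my own illustration (the classic matching-pennies game), chosen so the inequality is strict:

```python
# f(w, z) = F[w][z] on the finite index sets W = Z = {0, 1}
F = [[0, 1],
     [1, 0]]

# sup_z inf_w f(w, z): for each z, the w-player minimizes; then take the best z
sup_inf = max(min(F[w][z] for w in range(2)) for z in range(2))
# inf_w sup_z f(w, z): for each w, the z-player maximizes; then take the best w
inf_sup = min(max(F[w][z] for z in range(2)) for w in range(2))

assert sup_inf <= inf_sup          # weak max-min inequality
assert (sup_inf, inf_sup) == (0, 1)  # and here it is strict
```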

Lagrangian Duality (Convexity not required): Weak Duality

For any optimization problem (not just convex), the weak max-min inequality implies weak duality:

    p* = inf_x sup_{λ ⪰ 0, ν} [ f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{i=1}^p ν_i h_i(x) ]
       ≥ sup_{λ ⪰ 0, ν} inf_x [ f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{i=1}^p ν_i h_i(x) ] = d*

The difference p* − d* is called the duality gap. For convex problems, we often have strong duality: p* = d*.

Lagrangian Duality (Convexity not required): The Lagrange Dual Function

The Lagrangian dual problem:

    d* = sup_{λ ⪰ 0, ν} inf_x L(x, λ, ν),

where the inner infimum is the Lagrange dual function.

Definition. The Lagrange dual function (or just dual function) is

    g(λ, ν) = inf_{x ∈ D} L(x, λ, ν) = inf_{x ∈ D} ( f_0(x) + Σ_{i=1}^m λ_i f_i(x) + Σ_{i=1}^p ν_i h_i(x) ).

The dual function may take on the value −∞ (e.g. f_0(x) = x with no constraints).
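For simple problems the inner infimum has a closed form. For the toy problem minimize x^2 subject to 1 − x ≤ 0 (my own illustration, worked by hand: set dL/dx = 2x − λ = 0, so x = λ/2), the dual function is g(λ) = λ − λ^2/4, and every dual value lower-bounds p* = 1:

```python
def g(lam):
    # For minimize x^2 s.t. 1 - x <= 0: L(x, lam) = x^2 + lam * (1 - x).
    # The inner inf over x is attained at x = lam / 2 (set dL/dx = 0),
    # which gives g(lam) = lam - lam^2 / 4.
    return lam - lam * lam / 4.0

p_star = 1.0  # optimal value: f0(x*) = 1 at x* = 1

# weak duality: every dual value is a lower bound on p*
for lam in [0.0, 0.5, 1.0, 2.0, 5.0, 10.0]:
    assert g(lam) <= p_star + 1e-12
```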

Lagrangian Duality (Convexity not required): The Lagrange Dual Problem

In terms of the Lagrange dual function, we can write weak duality as

    p* ≥ sup_{λ ⪰ 0, ν} g(λ, ν) = d*.

So for any (λ, ν) with λ ⪰ 0, the Lagrange dual function gives a lower bound on the optimal value:

    g(λ, ν) ≤ p*.

Lagrangian Duality (Convexity not required): The Lagrange Dual Problem

The Lagrange dual problem is a search for the best lower bound:

    maximize    g(λ, ν)
    subject to  λ ⪰ 0.

(λ, ν) is dual feasible if λ ⪰ 0 and g(λ, ν) > −∞. (λ*, ν*) are dual optimal (or optimal Lagrange multipliers) if they are optimal for the Lagrange dual problem.

- The Lagrange dual problem is often easier to solve (simpler constraints).
- d* can be used as a stopping criterion for primal optimization.
- The dual can reveal hidden structure in the solution.
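For the toy problem minimize x^2 subject to x ≥ 1 (my own illustration), the dual function works out to g(λ) = λ − λ^2/4, and the only constraint on the dual problem is λ ≥ 0, so even a coarse grid search solves it; here the best lower bound matches p*:

```python
def g(lam):
    # dual function of: minimize x^2 s.t. 1 - x <= 0 (derived by hand)
    return lam - lam * lam / 4.0

# crude dual maximization: grid search over lam >= 0
grid = [i * 0.001 for i in range(10001)]  # lam in [0, 10]
lam_star = max(grid, key=g)
d_star = g(lam_star)

p_star = 1.0
assert abs(lam_star - 2.0) < 1e-6  # dual optimal multiplier
assert abs(d_star - p_star) < 1e-6  # best lower bound matches p* here
```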

Convex Optimization: Convex Optimization Problem, Standard Form

    minimize    f_0(x)
    subject to  f_i(x) ≤ 0,  i = 1, ..., m
                a_i^T x = b_i,  i = 1, ..., p

where f_0, ..., f_m are convex functions. Note: the equality constraints are now linear. Why?

Convex Optimization: Strong Duality for Convex Problems

For convex optimization problems, we usually have strong duality, but not always. For example:

    minimize    e^{−x}
    subject to  x^2 / y ≤ 0
                (domain: y > 0)

The additional conditions needed are called constraint qualifications. (Example from Laurent El Ghaoui's EE 227A: Lecture 8 Notes, Feb 9, 2012.)
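To see the gap in this example: feasibility forces x = 0 (since y > 0), so p* = e^0 = 1, while the Lagrangian e^{−x} + λx^2/y can be driven to 0 for every λ ≥ 0 by sending x and y to ∞, so d* = 0. The sketch below approximates the inner infimum with a crude grid (the grid bounds and the helper g_approx are my own choices):

```python
import math

# minimize e^{-x} subject to x^2 / y <= 0, with domain y > 0.
# Feasibility forces x = 0, so p* = e^0 = 1.
p_star = 1.0

def lagrangian(x, y, lam):
    return math.exp(-x) + lam * (x * x / y)

def g_approx(lam, y=1e9):
    # crude stand-in for inf over x, y: grid over x in [0, 50], y fixed large
    xs = [0.5 * i for i in range(101)]
    return min(lagrangian(x, y, lam) for x in xs)

# the dual function is 0 for every lam >= 0, so d* = 0 and the gap is p* - d* = 1
for lam in [0.0, 1.0, 10.0]:
    assert g_approx(lam) < 1e-3
```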

Convex Optimization: Slater's Constraint Qualification for Strong Duality

This gives sufficient conditions for strong duality in a convex problem. Roughly: the problem must be strictly feasible. The qualifications, when the problem domain D ⊆ R^n is an open set:

- There exists an x such that Ax = b and f_i(x) < 0 for i = 1, ..., m.
- For any affine inequality constraints, f_i(x) ≤ 0 is sufficient.
- Otherwise, x must be in the relative interior of D.

See notes, or BV Section 5.2.3, p. 226.

Complementary Slackness

Consider a general optimization problem (i.e. not necessarily convex). If we have strong duality, we get an interesting relationship between the optimal Lagrange multiplier λ_i* and the value of the ith constraint at the optimum, f_i(x*). The relationship, called complementary slackness, is

    λ_i* f_i(x*) = 0.

That is, the Lagrange multiplier is zero unless the constraint is active at the optimum.

Complementary Slackness: Proof

Assume strong duality, p* = d*, in a general optimization problem. Let x* be primal optimal and (λ*, ν*) be dual optimal. Then:

    f_0(x*) = g(λ*, ν*)
            = inf_x ( f_0(x) + Σ_{i=1}^m λ_i* f_i(x) + Σ_{i=1}^p ν_i* h_i(x) )
            ≤ f_0(x*) + Σ_{i=1}^m λ_i* f_i(x*) + Σ_{i=1}^p ν_i* h_i(x*)
            ≤ f_0(x*),

where the last step uses Σ_{i=1}^m λ_i* f_i(x*) ≤ 0 and Σ_{i=1}^p ν_i* h_i(x*) = 0. So each term in the sum Σ_{i=1}^m λ_i* f_i(x*) must actually be 0. That is,

    λ_i* f_i(x*) = 0,  i = 1, ..., m.

This condition is known as complementary slackness.
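The condition is easy to check on a toy problem of my own: minimize x^2 subject to 1 − x ≤ 0 (active at the optimum) and −x − 3 ≤ 0 (inactive). Working the dual by hand gives x* = 1 with multipliers λ* = (2, 0):

```python
# minimize x^2 subject to f1(x) = 1 - x <= 0 (active at x*)
#                    and  f2(x) = -x - 3 <= 0 (inactive at x*)
f1 = lambda x: 1.0 - x
f2 = lambda x: -x - 3.0

x_star = 1.0           # primal optimal (derived by hand)
lam_star = (2.0, 0.0)  # dual optimal multipliers (derived by hand)

# active constraint: f1(x*) = 0, so its multiplier may be positive
assert f1(x_star) == 0.0 and lam_star[0] > 0.0
# inactive constraint: f2(x*) < 0 forces its multiplier to be zero
assert f2(x_star) < 0.0 and lam_star[1] == 0.0
# complementary slackness: lam_i* * f_i(x*) = 0 for every i
assert lam_star[0] * f1(x_star) == 0.0
assert lam_star[1] * f2(x_star) == 0.0
```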