Duality (Continued)

Recall, the general primal problem is

    min f(x),  x ∈ X,  g(x) ≥ 0,

where X ⊆ Rⁿ, f : X → R, g : X → Rᵐ. The Lagrangian is the function L : X × Rᵐ → R defined by

    L(x, λ) = f(x) − λᵀg(x)

and the dual function L* : Rᵐ → R ∪ {−∞} is defined by

    L*(λ) = min_{x∈X} L(x, λ)

(where it is understood that max means sup and min means inf).

Theorem 13: The dual function L* is concave (regardless of the nature of the general primal problem).

Proof: Let λ¹, λ² ≥ 0 be such that L*(λ¹) > −∞ and L*(λ²) > −∞. Let α ∈ [0, 1] and consider αλ¹ + (1 − α)λ². Then

    L*(αλ¹ + (1 − α)λ²) = min_{x∈X} [ f(x) − (αλ¹ + (1 − α)λ²)ᵀg(x) ]
        = min_{x∈X} [ α( f(x) − λ¹ᵀg(x) ) + (1 − α)( f(x) − λ²ᵀg(x) ) ]
        ≥ α min_{x∈X} [ f(x) − λ¹ᵀg(x) ] + (1 − α) min_{x∈X} [ f(x) − λ²ᵀg(x) ]
        = αL*(λ¹) + (1 − α)L*(λ²).

If, on the other hand, L*(λ¹) = −∞ or L*(λ²) = −∞, then the concavity inequality

    L*(αλ¹ + (1 − α)λ²) ≥ αL*(λ¹) + (1 − α)L*(λ²)

holds automatically.

Alternative proof: For a fixed x ∈ X we see that f(x) − λᵀg(x) is linear in λ ≥ 0 and is therefore concave in λ (for this fixed x). Therefore,

    G_x = { (λ, β) : λ ∈ Rᵐ, β ∈ R, β ≤ f(x) − λᵀg(x) }

is a convex set. But

    G_{L*} = { (λ, β) : λ ∈ Rᵐ, β ∈ R, β ≤ L*(λ) }
           = { (λ, β) : λ ∈ Rᵐ, β ∈ R, β ≤ min_{x∈X} L(x, λ) }
           = { (λ, β) : λ ∈ Rᵐ, β ∈ R, β ≤ f(x) − λᵀg(x) for all x ∈ X }
           = ∩_{x∈X} G_x.

But the intersection of any collection of convex sets is also convex; therefore G_{L*} (the hypograph of L*) is convex, and so L* is concave in λ ≥ 0.
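Theorem 13 lends itself to a quick numerical sanity check. The sketch below uses a small illustrative instance chosen for this purpose (X = [−5, 5] discretized, f(x) = x² − x, g(x) = x − 1; this data is assumed here, not taken from the lecture) and samples the concavity inequality at random multiplier pairs:

```python
import numpy as np

# Illustrative instance (assumed, not from the lecture):
# X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

def L_star(lam):
    """Dual function L*(lam) = min over x in X of f(x) - lam*g(x)."""
    return np.min(f - lam * g)

# Theorem 13: L*(a*l1 + (1-a)*l2) >= a*L*(l1) + (1-a)*L*(l2) for l1, l2 >= 0.
rng = np.random.default_rng(0)
for _ in range(1000):
    l1, l2 = rng.uniform(0.0, 10.0, size=2)
    a = rng.uniform()
    assert L_star(a * l1 + (1 - a) * l2) >= a * L_star(l1) + (1 - a) * L_star(l2) - 1e-9
print("concavity inequality holds on 1000 random pairs")
```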

We now introduce the primal function φ, which may also be called the optimal value function or the perturbation function.

Definition: The set of feasible right-hand sides is defined to be

    Y = { y ∈ Rᵐ : there exists x ∈ X with g(x) ≥ y }

(note: g(X) ⊆ Y). The primal function φ : Y → R ∪ {−∞} is defined as

    φ(y) = min f(x),  x ∈ X,  g(x) ≥ y.
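A brute-force evaluation of φ on the same assumed instance makes the definition concrete, and checks that φ is nondecreasing (raising y shrinks the feasible set { x ∈ X : g(x) ≥ y }):

```python
import numpy as np

# Same assumed instance: X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

def phi(y):
    """Primal (perturbation) function: min f(x) over x in X with g(x) >= y."""
    feas = g >= y
    return np.min(f[feas]) if feas.any() else np.inf  # y outside Y: no feasible x

# phi is nondecreasing: a larger right-hand side y leaves fewer feasible x.
ys = np.linspace(-6.0, 3.0, 50)
vals = [phi(y) for y in ys]
assert all(a <= b + 1e-12 for a, b in zip(vals, vals[1:]))
```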

Theorem 14: If X is convex and g is a vector of concave functions, then Y is a convex set. If, in addition, f is convex on X, then φ is a convex function on Y.

Proof: Let y¹ ∈ Y, y² ∈ Y. Then there exist x¹ ∈ X, x² ∈ X so that g(x¹) ≥ y¹ and g(x²) ≥ y², and then αg(x¹) ≥ αy¹ and (1 − α)g(x²) ≥ (1 − α)y² for α ∈ [0, 1]. By concavity of the vector g we have

    g(αx¹ + (1 − α)x²) ≥ αg(x¹) + (1 − α)g(x²) ≥ αy¹ + (1 − α)y²,

so αy¹ + (1 − α)y² ∈ Y, since αx¹ + (1 − α)x² ∈ X (X is convex).

Duality (Continued) 2 Now, also assume f is convex on X. Let y Y, y Y be 2 such that ( y ) and ( y ). hen, by the definition of inf, for all 0 there exist x 2, x X so that g( x ) y g( x ) y 2 2 i i and f( x ) ( y ), i, 2. Also, by concavity of g, x ( ) x F xx g( x) y ( ) y 2 2 8

and, therefore,

    φ(αy¹ + (1 − α)y²) ≤ f(αx¹_ε + (1 − α)x²_ε) ≤ αf(x¹_ε) + (1 − α)f(x²_ε) ≤ αφ(y¹) + (1 − α)φ(y²) + ε.

Therefore, by letting ε ↓ 0 we get

    φ(αy¹ + (1 − α)y²) ≤ αφ(y¹) + (1 − α)φ(y²).

HW 46: Complete the proof for the case where φ(y¹) = −∞ or φ(y²) = −∞.

Subdifferentiability

Let f : R → R be defined by f(x) = |x|. Then f is a convex function which is not differentiable at x̄ = 0. However, note that at x̄ = 0 we have

    |x| ≥ |x̄| + ξ(x − x̄),  i.e.,  |x| ≥ ξx,

and this holds for all ξ ∈ [−1, 1]. Such a function is said to be subdifferentiable at x̄.

Definition: Let X ⊆ Rⁿ be convex and let f : X → R. f is said to be subdifferentiable at x̄ ∈ X (in the convex sense) if there is a vector ξ_x̄ ∈ Rⁿ so that

    f(x) ≥ f(x̄) + ξ_x̄ᵀ(x − x̄)  for all x ∈ X.

The vector ξ_x̄ is called a subgradient, or a support, at x̄.
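The defining inequality is easy to test numerically. A minimal sketch for f(x) = |x| at x̄ = 0 (the grid stands in for X = R): every ξ ∈ [−1, 1] passes, while values outside that interval fail.

```python
import numpy as np

x = np.linspace(-3.0, 3.0, 1201)  # sample points standing in for X = R

def is_support(xi):
    """Does f(x) >= f(0) + xi*(x - 0) hold for f(x) = |x| on the grid?"""
    return bool(np.all(np.abs(x) >= xi * x))

assert all(is_support(xi) for xi in np.linspace(-1.0, 1.0, 21))
assert not is_support(1.5) and not is_support(-1.5)
```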

[Note: f is subdifferentiable in the concave sense at x̄ if −f is subdifferentiable in the convex sense at x̄.] We say f is subdifferentiable on X (in either sense) if it is so at each x̄ ∈ X.

Theorem 15: Let X be convex and let f : X → R be subdifferentiable in the convex sense (in the concave sense). Then f is convex on X (concave on X).

Proof: Let x¹, x² ∈ X and let x_α = αx¹ + (1 − α)x² for α ∈ [0, 1]. Then there is ξ ∈ Rⁿ so that, by subdifferentiability,

    f(x¹) ≥ f(x_α) + ξᵀ(x¹ − x_α),
    f(x²) ≥ f(x_α) + ξᵀ(x² − x_α).

Multiplying these by α and (1 − α), respectively, and adding:

    αf(x¹) + (1 − α)f(x²) ≥ f(x_α) + ξᵀ( αx¹ + (1 − α)x² − x_α ) = f(x_α). //

HW 47: Let X be convex and let f : X → R be convex. Does this imply f is subdifferentiable, in the convex sense, on X?

HW 48: Show that if f is differentiable at x̄ and if ξ_x̄ is a support at x̄, then ∇f(x̄) = ξ_x̄. [Hint: Simply use the notion of directional derivative.]

Theorem 16: Consider the general primal problem min f(x), x ∈ X, g(x) ≥ 0. Let λ̄ ≥ 0 and let x̄ ∈ X solve min_{x∈X} L(x, λ̄). Then −g(x̄) is a support (concave sense) for L* at λ̄.

Proof: To show: L*(λ) ≤ L*(λ̄) − g(x̄)ᵀ(λ − λ̄) for all λ ≥ 0. Now,

    L*(λ) = inf_{x∈X} ( f(x) − λᵀg(x) ) ≤ f(x̄) − λᵀg(x̄)
          = f(x̄) − λ̄ᵀg(x̄) + λ̄ᵀg(x̄) − λᵀg(x̄)
          = L*(λ̄) − g(x̄)ᵀ(λ − λ̄).

[Note: Because of HW 48, if L* is differentiable at λ̄ then ∇L*(λ̄) = −g(x̄).]
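Theorem 16's supergradient inequality can be checked on the assumed instance used earlier:

```python
import numpy as np

# Assumed instance again: X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

def L_star_argmin(lam):
    """Return (L*(lam), a minimizer xbar of L(x, lam))."""
    vals = f - lam * g
    i = int(np.argmin(vals))
    return vals[i], X[i]

lbar = 2.0
L_bar, xbar = L_star_argmin(lbar)
g_bar = xbar - 1.0  # g(xbar)

# Theorem 16: L*(lam) <= L*(lbar) - g(xbar)*(lam - lbar) for all lam >= 0.
for lam in np.linspace(0.0, 10.0, 101):
    L_lam, _ = L_star_argmin(lam)
    assert L_lam <= L_bar - g_bar * (lam - lbar) + 1e-9
```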

The following is an often useful result.

Theorem 17: Consider the general problem

    GP:  min f(x),  x ∈ X,  g(x) ≥ 0.

Let λ̄ ≥ 0 and let x̄ solve min_{x∈X} L(x, λ̄). Then x̄ solves the perturbed problem

    P_{g(x̄)}:  min f(x),  x ∈ X,  g(x) ≥ g(x̄).

Proof: Let ȳ = g(x̄). The Lagrangian for this perturbed problem is

    L_ȳ(x, λ) = f(x) − λᵀ( g(x) − g(x̄) ) = f(x) − λᵀg(x) + λᵀg(x̄) = L(x, λ) + λᵀg(x̄).

Therefore,

    L_ȳ*(λ̄) = min_{x∈X} L_ȳ(x, λ̄) = min_{x∈X} L(x, λ̄) + λ̄ᵀg(x̄) = L(x̄, λ̄) + λ̄ᵀg(x̄) = L_ȳ(x̄, λ̄).

Therefore,

    (i) x̄ solves min_{x∈X} L_ȳ(x, λ̄);
    (ii) x̄ ∈ X, g(x̄) ≥ g(x̄) = ȳ;
    (iii) λ̄ᵀ( g(x̄) − g(x̄) ) = 0;

so (x̄, λ̄) is a saddle-point for the Lagrangian L_ȳ(x, λ) (which is the Lagrangian for P_{g(x̄)}).

HW 49: Corollary: x̄ is optimal for min f(x), x ∈ X, g(x) ≥ y, where y_i = g_i(x̄) if λ̄_i > 0, and y_i ≤ g_i(x̄) if λ̄_i = 0.
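Theorem 17 can likewise be observed on the assumed instance: the minimizer of L(·, λ̄) also solves the perturbed problem whose right-hand side is its own constraint value g(x̄).

```python
import numpy as np

# Assumed instance: X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

lbar = 3.0
xbar = X[np.argmin(f - lbar * g)]   # xbar solves min_x L(x, lbar); here xbar = 2.0
ybar = xbar - 1.0                   # the perturbation y = g(xbar)

# xbar should solve:  min f(x)  s.t.  x in X,  g(x) >= g(xbar).
feasible = g >= ybar
assert np.isclose(np.min(f[feasible]), xbar**2 - xbar)
```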

We now want to demonstrate the result that if x̄ solves min_{x∈X} L(x, λ̄), then λ̄ is a support (convex sense) for φ at g(x̄). That is, we want to show

    φ(y) ≥ φ(g(x̄)) + λ̄ᵀ( y − g(x̄) )  for all y ∈ Y.

Therefore, if φ is differentiable at g(x̄), then ∇φ(g(x̄)) = λ̄.

The most convenient way to lead up to this important result is through the following, which relates the primal function φ with the dual function L*.

Theorem 18: L*(λ) = inf_{y∈Y} ( φ(y) − λᵀy ).

Proof: For any y ∈ Y,

    L*(λ) = inf_{x∈X} ( f(x) − λᵀg(x) ) ≤ inf { f(x) − λᵀg(x) : x ∈ X, g(x) ≥ y }
          ≤ inf { f(x) − λᵀy : x ∈ X, g(x) ≥ y } = φ(y) − λᵀy.

Therefore, L*(λ) ≤ inf_{y∈Y} ( φ(y) − λᵀy ). It remains to show the opposite inequality. Define Ŷ = { y ∈ Rᵐ : y = g(x) for some x ∈ X }. Then Ŷ ⊆ Y. Now, let x̃ ∈ X and let ỹ = g(x̃) (so ỹ ∈ Ŷ).

Then,

    f(x̃) − λᵀg(x̃) = f(x̃) − λᵀỹ ≥ inf { f(x) : x ∈ X, g(x) ≥ ỹ } − λᵀỹ = φ(ỹ) − λᵀỹ ≥ inf_{y∈Y} ( φ(y) − λᵀy ).

Therefore, since x̃ ∈ X is arbitrary, we have

    L*(λ) = inf_{x∈X} ( f(x) − λᵀg(x) ) ≥ inf_{y∈Y} ( φ(y) − λᵀy ).
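Theorem 18's identity between L* and φ can be verified, up to grid resolution, on the assumed instance:

```python
import numpy as np

# Assumed instance: X = [-5, 5] discretized, f(x) = x**2 - x, g(x) = x - 1.
X = np.linspace(-5.0, 5.0, 10001)
f = X**2 - X
g = X - 1.0

def L_star(lam):
    return np.min(f - lam * g)

def phi(y):
    feas = g >= y
    return np.min(f[feas]) if feas.any() else np.inf

# Theorem 18: L*(lam) = inf over y in Y of (phi(y) - lam*y).
ys = np.linspace(-6.0, float(g.max()), 2001)  # grid covering the relevant part of Y
for lam in [0.0, 0.5, 1.0, 3.0]:
    rhs = min(phi(y) - lam * y for y in ys)
    assert abs(L_star(lam) - rhs) < 0.05      # equal up to discretization error
```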

We can now derive the result dual to that of Theorem 16.

Theorem 19: Let λ̄ ≥ 0 and let x̄ ∈ X solve min_{x∈X} L(x, λ̄). Then λ̄ is a support (convex sense) for φ at g(x̄).

Proof: To show: φ(y) ≥ φ(g(x̄)) + λ̄ᵀ( y − g(x̄) ) for all y ∈ Y. By Theorem 17, we know that (x̄, λ̄) is a saddle-point for

    min f(x),  x ∈ X,  g(x) ≥ g(x̄)

and, therefore,

    φ(g(x̄)) = f(x̄) = L(x̄, λ̄) + λ̄ᵀg(x̄) = L*(λ̄) + λ̄ᵀg(x̄)
             = inf_{y∈Y} ( φ(y) − λ̄ᵀy ) + λ̄ᵀg(x̄)          [by Theorem 18]
             ≤ φ(y) − λ̄ᵀy + λ̄ᵀg(x̄)  for all y ∈ Y,

so φ(y) ≥ φ(g(x̄)) + λ̄ᵀ( y − g(x̄) ) for all y ∈ Y.

Note: If φ is differentiable at g(x̄), then ∇φ(g(x̄)) = λ̄.

Let us define the sets of supports for φ and L*, respectively, as

    ∂φ(z) = { λ ∈ Rᵐ : φ(y) ≥ φ(z) + λᵀ(y − z) for all y ∈ Y }

and

    ∂L*(γ) = { y ∈ Rᵐ : L*(λ) ≤ L*(γ) + yᵀ(λ − γ) for all λ ≥ 0 }.

In terms of this notation, Theorems 16 and 19 can be summarized as follows.

Theorem 20: Let λ̄ ≥ 0 and let x̄ ∈ X solve min_{x∈X} L(x, λ̄). Then

    −g(x̄) ∈ ∂L*(λ̄)  and  λ̄ ∈ ∂φ(g(x̄)).

Moreover, if L* is differentiable at λ̄ and φ is differentiable at g(x̄), then

    ∇L*(λ̄) = −g(x̄)  and  ∇φ(g(x̄)) = λ̄.

Example 1: min_{x∈R} x, −x² ≥ 0.

Recall, this problem has no saddle-point. Also, we showed

    L*(λ) = −1/(4λ) if λ > 0,  and  L*(λ) = −∞ if λ = 0,

so that max_{λ≥0} L*(λ) = sup_{λ≥0} L*(λ) = 0 has no optimizing vector. Also, note that

    Y = { y : y ≤ 0 }  and, for y ∈ Y,  φ(y) = −√(−y).

And this function has no convex support at the origin (ȳ = 0).
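A quick numerical look at Example 1: the dual values −1/(4λ) climb toward 0 without attaining it, and the slope that a convex support of φ at the origin would require grows without bound (a minimal sketch; the grid stands in for X = R):

```python
import numpy as np

# Example 1: min x s.t. -x**2 >= 0, so L(x, lam) = x + lam*x**2.
# For lam > 0 the minimum is at x = -1/(2*lam), giving L*(lam) = -1/(4*lam).
x = np.linspace(-50.0, 50.0, 2_000_001)
for lam in [0.1, 1.0, 10.0]:
    assert abs(np.min(x + lam * x**2) - (-1.0 / (4.0 * lam))) < 1e-6
# sup over lam >= 0 of L*(lam) = 0 is approached as lam -> oo, never attained.

# A convex support s for phi(y) = -sqrt(-y) at y = 0 would need
# -sqrt(-y) >= s*y for all y <= 0, i.e. s >= 1/sqrt(-y) -> oo as y -> 0-.
for y in -np.logspace(-8, 0, 9):
    print(y, 1.0 / np.sqrt(-y))  # required slope grows without bound
```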

Example 2: min_{x∈R} x², x³ − 1 ≥ 0.

Recall, this problem has no saddle-point. We also showed

    L*(λ) = 0 if λ = 0,  and  L*(λ) = −∞ if λ > 0.

Therefore, λ̄ = 0 solves max_{λ≥0} L*(λ). Also, note that Y = R and

    φ(y) = (1 + y)^(2/3) if y ≥ −1,  and

    φ(y) = 0 if y < −1.

Note that φ has no convex support at the origin (ȳ = 0). Also, note that L(x, λ̄) = x², and therefore only x̄ = 0 solves min_{x∈R} L(x, λ̄), and x̄, of course, is not even feasible. Also, note that max_{λ≥0} L(x̄, λ) = +∞, since g(x̄) < 0 (so this is no help either).

Example 3: min x, x ≥ 0, where X = R. Then L(x, λ) = (1 − λ)x, x ∈ R, and

    L*(λ) = 0 if λ = 1,  and  L*(λ) = −∞ if λ ≠ 1.

Therefore, λ* = 1 solves max_{λ≥0} L*(λ). Also, note that (x*, λ*) is a saddle-point, where x* = 0 (show this!). However, L(x, λ*) = 0 for all x ∈ R and, therefore, x* is not the only optimizer for the Lagrangian (parameterized by the optimal dual vector). In particular, the operation

    min_{x∈R} L(x, λ*)

does not automatically provide an optimal primal solution. Note further that the dual function L* is not differentiable at λ*.
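A tiny sketch of Example 3's difficulty, which is exactly what the differentiability hypothesis of Theorem 22 below rules out: at λ* = 1 the Lagrangian is identically zero, so minimizing it cannot single out x* = 0.

```python
import numpy as np

# Example 3: min x s.t. x >= 0, X = R, so L(x, lam) = (1 - lam)*x.
lam_star = 1.0
x = np.linspace(-10.0, 10.0, 101)
L = (1.0 - lam_star) * x
assert np.all(L == 0.0)       # the Lagrangian is flat: every x is a minimizer
print(x[np.argmin(L)])        # e.g. -10.0, an infeasible point, not x* = 0
```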

These examples lead to the following.

Theorem 21: Consider the general primal problem

    GP:  min f(x),  x ∈ X,  g(x) ≥ 0,

and assume GP has an optimal vector x̄. Then GP has a saddle-point if, and only if, the primal function φ has a nonnegative support (convex sense) at the origin (ȳ = 0).

Proof: Suppose (x̄, λ̄) is a saddle-point. To show: λ̄ ∈ ∂φ(0), i.e.,

    φ(y) ≥ φ(0) + λ̄ᵀy  for all y ∈ Y.

Now,

    φ(0) = f(x̄) = L*(λ̄) = inf_{y∈Y} ( φ(y) − λ̄ᵀy ) ≤ φ(y) − λ̄ᵀy,  all y ∈ Y,

or

    φ(y) ≥ φ(0) + λ̄ᵀy,  all y ∈ Y.

Conversely, suppose λ̄ ∈ ∂φ(0), λ̄ ≥ 0. Then

    φ(y) ≥ φ(0) + λ̄ᵀy, all y ∈ Y  ⟹  L*(λ̄) = inf_{y∈Y} ( φ(y) − λ̄ᵀy ) ≥ φ(0).

But the Weak Duality Theorem states that L*(λ̄) ≤ φ(0) and, therefore, L*(λ̄) = φ(0) = f(x̄). Therefore, λ̄ solves the dual problem, and f(x̄) = L*(λ̄) implies that (x̄, λ̄) is a saddle-point.

Theorem 22: Let λ̄ ≥ 0 solve max_{λ≥0} L*(λ), and further assume that L* is differentiable at λ̄. Then any x* ∈ X which solves min_{x∈X} L(x, λ̄) is also optimal for the general primal problem GP.

Proof: Since λ̄ is optimal we must have

    ∇L*(λ̄)ᵀ( λ − λ̄ ) ≤ 0,  all λ ≥ 0

(since { λ − λ̄ : λ ≥ 0 } is the set of feasible directions at λ̄). By HW 48 and Theorem 16 we have ∇L*(λ̄) = −g(x*), and therefore

    −g(x*)ᵀ( λ − λ̄ ) ≤ 0,  all λ ≥ 0,

or

    inf_{λ≥0} λᵀg(x*) ≥ λ̄ᵀg(x*).    (*)

Therefore, g(x*) ≥ 0, since if, say, g_i(x*) < 0, then inf_{λ≥0} λᵀg(x*) = −∞. Therefore, λ̄ᵀg(x*) ≥ 0 (as λ̄ ≥ 0 and g(x*) ≥ 0). But, by setting λ = 0 in (*), we have λ̄ᵀg(x*) ≤ 0. Hence λ̄ᵀg(x*) = 0. Therefore,

    (i) x* ∈ X solves min_{x∈X} L(x, λ̄);
    (ii) x* ∈ X, g(x*) ≥ 0;
    (iii) λ̄ᵀg(x*) = 0;

so (x*, λ̄) is a saddle-point for GP, which, in turn, implies x* is optimal for GP.
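When L* is differentiable, Theorem 22 licenses recovering a primal solution from the Lagrangian minimization. A minimal sketch on a smooth assumed instance (min x² − x subject to x − 1 ≥ 0 with X = R, chosen for illustration, not from the lecture): maximize L* by projected gradient ascent using ∇L*(λ) = −g(x(λ)) (Theorem 16 and HW 48), then read off x(λ̄).

```python
# Assumed instance: min x**2 - x  s.t.  x - 1 >= 0, X = R.  Primal optimum x* = 1.
# L(x, lam) = x**2 - x - lam*(x - 1) has the unique minimizer x(lam) = (1 + lam)/2,
# so L* is differentiable with dL*/dlam = -g(x(lam)) = -((1 + lam)/2 - 1).
lam = 0.0
for _ in range(200):
    x_lam = (1.0 + lam) / 2.0         # solves min_x L(x, lam)
    grad = -(x_lam - 1.0)             # gradient of L* (Theorem 16 / HW 48)
    lam = max(0.0, lam + 0.5 * grad)  # projected ascent step, keeps lam >= 0
print(lam, (1.0 + lam) / 2.0)         # -> 1.0 1.0: the recovered x is primal optimal
```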

HW 50: Consider the problem

    min_{x∈R²} x₁,  x₂ − x₁² ≥ 0,  −x₂ ≥ 0.

[Note: This problem is equivalent to Example 1.] Show whether this problem has a saddle-point.

HW 51: Consider GP and assume there is a vector x̃ ∈ X so that g(x̃) > 0 (i.e., g₁(x̃) > 0, g₂(x̃) > 0, …, g_m(x̃) > 0). Show that the set of "λ" components of saddle-points (if any) is bounded. That is, show that the set

    { λ ≥ 0 : there exists x̄ ∈ X such that (x̄, λ) is a saddle-point }

is a bounded set. [Hint: Use the definition of the saddle-point.] Is the set for HW 50 bounded?

[Note: The proof of Theorem 22 is not entirely rigorous, since λ̄ may have some zero components (i.e., λ̄ may be on the boundary of { λ : λ ≥ 0 }) and we have not said what we mean by L* being differentiable at a boundary point.]