# 4TE3/6TE3 Algorithms for Continuous Optimization


4TE3/6TE3 Algorithms for Continuous Optimization (Duality in Nonlinear Optimization). Tamás Terlaky, Computing and Software, McMaster University, Hamilton, January 2004. Tel: 27780

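## Optimality conditions for constrained convex optimization problems

$$\text{(CO)}\qquad \min f(x) \quad \text{s.t.}\quad g_j(x) \le 0,\ j = 1,\dots,m,\quad x \in \mathcal{C}.$$

The feasible set is $\mathcal{F} = \{x \in \mathcal{C} \mid g_j(x) \le 0,\ j \in J\}$ with $J = \{1,\dots,m\}$. Let $\mathcal{C}^0$ denote the relative interior of the convex set $\mathcal{C}$.

Definition 1. A vector (point) $x^0 \in \mathcal{C}^0$ is called a Slater point of (CO) if

$$g_j(x^0) < 0 \text{ for all } j \text{ where } g_j \text{ is nonlinear},\qquad g_j(x^0) \le 0 \text{ for all } j \text{ where } g_j \text{ is linear}.$$

If a Slater point exists, (CO) is called Slater regular; in other words, (CO) satisfies the Slater condition (the Slater constraint qualification).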
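Definition 1 can be checked mechanically once the constraints are labelled as linear or nonlinear. The following is a minimal sketch, assuming $\mathcal{C} = \mathbb{R}^2$ (so every point lies in the relative interior); the function name `is_slater_point` and the two toy constraints are hypothetical and not taken from the slides.

```python
def is_slater_point(x, constraints):
    """Check Definition 1 for C = R^n.

    constraints is a list of (g, is_linear) pairs (an assumed encoding):
    nonlinear g_j must satisfy g_j(x) < 0, linear g_j only g_j(x) <= 0.
    """
    for g, is_linear in constraints:
        value = g(x)
        if is_linear:
            if value > 0:
                return False
        elif value >= 0:
            return False
    return True


# Hypothetical toy constraints, chosen only for illustration.
constraints = [
    (lambda x: x[0] ** 2 + x[1] ** 2 - 1.0, False),  # nonlinear: unit disc
    (lambda x: x[0] - x[1], True),                   # linear: x1 <= x2
]

inside = is_slater_point((0.0, 0.0), constraints)       # strict in the disc, linear constraint active
on_boundary = is_slater_point((1.0, 0.0), constraints)  # nonlinear constraint active
print(inside, on_boundary)
```

The point $(0,0)$ qualifies even though the linear constraint is active there, which is exactly the relaxation Definition 1 grants to linear constraints.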

## Ideal Slater point

Some constraint functions $g_j(x)$ might take the value zero for all feasible points. Such constraints are called singular, while the others are called regular:

$$J_s = \{j \in J \mid g_j(x) = 0 \text{ for all } x \in \mathcal{F}\},\qquad J_r = J \setminus J_s = \{j \in J \mid g_j(x) < 0 \text{ for some } x \in \mathcal{F}\}.$$

Remark: If (CO) is Slater regular, then all singular functions must be linear.

Definition 2. A vector (point) $x^* \in \mathcal{C}^0$ is called an ideal Slater point of the convex optimization problem (CO) if $x^*$ is a Slater point and

$$g_j(x^*) < 0 \text{ for all } j \in J_r,\qquad g_j(x^*) = 0 \text{ for all } j \in J_s.$$

Lemma 1. If the problem (CO) is Slater regular, then there exists an ideal Slater point $x^* \in \mathcal{F}$.

## Convex Farkas Lemma

Theorem 1. Let $U \subseteq \mathbb{R}^n$ be a convex set and let a point $w \in \mathbb{R}^n$ with $w \notin U$ be given. Then there is a separating hyperplane $\{x \mid a^T x = \alpha\}$, with $a \in \mathbb{R}^n$, $\alpha \in \mathbb{R}$, such that $a^T w \le \alpha \le a^T u$ for all $u \in U$, but $U$ is not a subset of the hyperplane $\{x \mid a^T x = \alpha\}$.

Note that the last property says that there is a $u \in U$ such that $a^T u > \alpha$.

Lemma 2 (Farkas). Let the convex optimization problem (CO) be given and assume that the Slater regularity condition is satisfied. The inequality system

$$f(x) < 0,\qquad g_j(x) \le 0,\ j = 1,\dots,m,\qquad x \in \mathcal{C} \tag{1}$$

has no solution if and only if there exists a vector $y = (y_1,\dots,y_m) \ge 0$ such that

$$f(x) + \sum_{j=1}^m y_j g_j(x) \ge 0 \quad \text{for all } x \in \mathcal{C}. \tag{2}$$

The systems (1) and (2) are called alternative systems, i.e. exactly one of them has a solution.

## Proof of the Convex Farkas Lemma

If (1) is solvable, then (2) cannot hold. To prove the other direction, assume that (1) has no solution. With $u = (u_0,\dots,u_m)$, we define the set $U \subseteq \mathbb{R}^{m+1}$:

$$U = \{u \mid \exists\, x \in \mathcal{C} \text{ with } u_0 > f(x),\ u_j \ge g_j(x) \text{ if } j \in J_r,\ u_j = g_j(x) \text{ if } j \in J_s\}.$$

$U$ is convex (note that due to the Slater condition the singular functions are linear), and due to the infeasibility of (1) it does not contain the origin. By the separation Theorem 1 there exists a separating hyperplane defined by $(y_0, y_1,\dots,y_m)$ and $\alpha = 0$ such that

$$\sum_{j=0}^m y_j u_j \ge 0 \quad \text{for all } u \in U \tag{3}$$

and for some $\bar{u} \in U$ one has

$$\sum_{j=0}^m y_j \bar{u}_j > 0. \tag{4}$$

The rest of the proof is divided into four parts.

I. First we prove that $y_0 \ge 0$ and $y_j \ge 0$ for all $j \in J_r$.
II. Secondly we establish that (3) holds for $u = (f(x), g_1(x),\dots,g_m(x))$ if $x \in \mathcal{C}$.
III. Then we prove that $y_0$ must be positive.
IV. Finally, it is shown by induction that $y_j > 0$ for all $j \in J_s$ can be reached.

## Proof: Steps I and II

I. First we show that $y_0 \ge 0$ and $y_j \ge 0$ for all $j \in J_r$. Assume that $y_0 < 0$ and take an arbitrary $(u_0, u_1,\dots,u_m) \in U$. By definition $(u_0 + \lambda, u_1,\dots,u_m) \in U$ for all $\lambda \ge 0$. Hence by (3) one has

$$\lambda y_0 + \sum_{j=0}^m y_j u_j \ge 0 \quad \text{for all } \lambda \ge 0.$$

For sufficiently large $\lambda$ the left-hand side is negative, which is a contradiction, i.e. $y_0$ must be nonnegative. The proof of the nonnegativity of all $y_j$, $j \in J_r$, goes analogously.

II. Secondly we establish that

$$y_0 f(x) + \sum_{j=1}^m y_j g_j(x) \ge 0 \quad \text{for all } x \in \mathcal{C}. \tag{5}$$

This follows from the observation that for all $x \in \mathcal{C}$ and for all $\lambda > 0$ one has $u = (f(x) + \lambda, g_1(x),\dots,g_m(x)) \in U$, thus

$$y_0 (f(x) + \lambda) + \sum_{j=1}^m y_j g_j(x) \ge 0 \quad \text{for all } x \in \mathcal{C}.$$

Taking the limit as $\lambda \to 0$ the claim follows.

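## Proof: Step III

III. Thirdly we show that $y_0 > 0$. The proof is by contradiction. We already know that $y_0 \ge 0$. Assume to the contrary that $y_0 = 0$. Hence from (5) we have

$$\sum_{j \in J_r} y_j g_j(x) + \sum_{j \in J_s} y_j g_j(x) = \sum_{j=1}^m y_j g_j(x) \ge 0 \quad \text{for all } x \in \mathcal{C}.$$

Taking an ideal Slater point $x^* \in \mathcal{C}^0$ one has $g_j(x^*) = 0$ if $j \in J_s$, whence

$$\sum_{j \in J_r} y_j g_j(x^*) \ge 0.$$

Since $y_j \ge 0$ and $g_j(x^*) < 0$ for all $j \in J_r$, this implies $y_j = 0$ for all $j \in J_r$. This results in

$$\sum_{j \in J_s} y_j g_j(x) \ge 0 \quad \text{for all } x \in \mathcal{C}. \tag{6}$$

Now, from (4), with $\bar{x} \in \mathcal{C}$ such that $\bar{u}_j = g_j(\bar{x})$ if $j \in J_s$, we have

$$\sum_{j \in J_s} y_j g_j(\bar{x}) > 0. \tag{7}$$

Because the ideal Slater point $x^*$ is in the relative interior of $\mathcal{C}$, there exist a vector $\tilde{x} \in \mathcal{C}$ and $0 < \lambda < 1$ such that $x^* = \lambda \bar{x} + (1-\lambda)\tilde{x}$. Using that $g_j(x^*) = 0$ for $j \in J_s$ and that the singular functions are linear, one obtains the chain of relations that leads to the contradiction.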

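## Proof: Step III (continued)

With $\bar{x}$ from (7), $\tilde{x} \in \mathcal{C}$ and $0 < \lambda < 1$ such that $x^* = \lambda\bar{x} + (1-\lambda)\tilde{x}$:

$$\begin{aligned}
0 = \sum_{j \in J_s} y_j g_j(x^*) &= \sum_{j \in J_s} y_j g_j(\lambda \bar{x} + (1-\lambda)\tilde{x}) \\
&= \lambda \sum_{j \in J_s} y_j g_j(\bar{x}) + (1-\lambda) \sum_{j \in J_s} y_j g_j(\tilde{x}) \\
&> (1-\lambda) \sum_{j \in J_s} y_j g_j(\tilde{x}).
\end{aligned}$$

Here the last inequality follows from (7). The inequality $(1-\lambda) \sum_{j \in J_s} y_j g_j(\tilde{x}) < 0$ contradicts (6). Hence we have proved that $y_0 > 0$.

At this point we have (5) with $y_0 > 0$ and $y_j \ge 0$ for all $j \in J_r$. Dividing by $y_0 > 0$ in (5) and defining $y_j := y_j / y_0$ for all $j \in J$ we obtain

$$f(x) + \sum_{j=1}^m y_j g_j(x) \ge 0 \quad \text{for all } x \in \mathcal{C}. \tag{8}$$

We finally show that $y$ may be taken such that $y_j > 0$ for all $j \in J_s$.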

## Proof: Step IV

IV. To complete the proof we show by induction on the cardinality of $J_s$ that one can make $y_j$ nonnegative for all $j \in J_s$. Observe that if $J_s = \emptyset$ then we are done. If $|J_s| = 1$ then we apply the results proved up to this point to the inequality system

$$g_s(x) < 0,\qquad g_j(x) \le 0,\ j \in J_r,\qquad x \in \mathcal{C}, \tag{9}$$

where $\{s\} = J_s$. The system (9) has no solution and it satisfies the Slater condition, and therefore there exists a $\hat{y} \in \mathbb{R}^{m-1}$ such that

$$g_s(x) + \sum_{j \in J_r} \hat{y}_j g_j(x) \ge 0 \quad \text{for all } x \in \mathcal{C}, \tag{10}$$

where $\hat{y}_j \ge 0$ for all $j \in J_r$ and the coefficient of $g_s$ is positive. Adding a sufficiently large positive multiple of (10) to (8) one obtains a positive coefficient for $g_s(x)$.

The general inductive step goes analogously.

## Proof: Step IV (continued)

Assume that the result is proved if $|J_s| = k$; we prove it for the case $|J_s| = k+1$. Let $s \in J_s$; then $|J_s \setminus \{s\}| = k$, and hence the inductive assumption applies to the system

$$g_s(x) < 0,\qquad g_j(x) \le 0,\ j \in J_s \setminus \{s\},\qquad g_j(x) \le 0,\ j \in J_r,\qquad x \in \mathcal{C}. \tag{11}$$

By construction the system (11) has no solution and it satisfies the Slater condition, so by the inductive assumption we have a $\hat{y} \in \mathbb{R}^{m-1}$ such that

$$g_s(x) + \sum_{j \in J_r \cup J_s \setminus \{s\}} \hat{y}_j g_j(x) \ge 0 \quad \text{for all } x \in \mathcal{C}, \tag{12}$$

where $\hat{y}_j > 0$ for all $j \in J_s \setminus \{s\}$ and $\hat{y}_j \ge 0$ for all $j \in J_r$. Adding a sufficiently large multiple of (12) to (8) one obtains the desired nonnegative multipliers.

Remark: We have finally proved slightly more than was stated: the multipliers of all the singular constraints can be made strictly positive. Compare strict complementarity in LO (Goldman-Tucker Theorem).

## Karush-Kuhn-Tucker theory: Lagrangean, saddle point

The Lagrange function:

$$L(x, y) := f(x) + \sum_{j=1}^m y_j g_j(x), \tag{13}$$

where $x \in \mathcal{C}$ and $y \ge 0$. Note that for fixed $y$ the Lagrangean is convex in $x$; for fixed $x$ it is linear in $y$.

Definition 3. A vector pair $(\bar{x}, \bar{y}) \in \mathbb{R}^{n+m}$, $\bar{x} \in \mathcal{C}$ and $\bar{y} \ge 0$, is called a saddle point of the Lagrange function $L$ if

$$L(\bar{x}, y) \le L(\bar{x}, \bar{y}) \le L(x, \bar{y}) \tag{14}$$

for all $x \in \mathcal{C}$ and $y \ge 0$.

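## A saddle point lemma

Lemma 3. The vector $(\bar{x}, \bar{y}) \in \mathbb{R}^{n+m}$, $\bar{x} \in \mathcal{C}$ and $\bar{y} \ge 0$, is a saddle point of $L(x, y)$ if and only if

$$\inf_{x \in \mathcal{C}} \sup_{y \ge 0} L(x, y) = L(\bar{x}, \bar{y}) = \sup_{y \ge 0} \inf_{x \in \mathcal{C}} L(x, y). \tag{15}$$

Proof: The saddle point inequality (14) easily follows from (15). On the other hand, for any $(\hat{x}, \hat{y})$ one has

$$\inf_{x \in \mathcal{C}} L(x, \hat{y}) \le L(\hat{x}, \hat{y}) \le \sup_{y \ge 0} L(\hat{x}, y),$$

hence one can take the supremum of the left-hand side and the infimum of the right-hand side, resulting in

$$\sup_{y \ge 0} \inf_{x \in \mathcal{C}} L(x, y) \le \inf_{x \in \mathcal{C}} \sup_{y \ge 0} L(x, y). \tag{16}$$

Using the saddle point inequality (14) one obtains

$$\inf_{x \in \mathcal{C}} \sup_{y \ge 0} L(x, y) \le \sup_{y \ge 0} L(\bar{x}, y) \le L(\bar{x}, \bar{y}) \le \inf_{x \in \mathcal{C}} L(x, \bar{y}) \le \sup_{y \ge 0} \inf_{x \in \mathcal{C}} L(x, y). \tag{17}$$

Combining (17) and (16), the equality (15) follows.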
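The saddle point inequality (14) can be probed numerically on a small instance. The sketch below assumes a toy problem that is not from the slides: $f(x) = x_1^2 + x_2^2$, $g(x) = 1 - x_1 - x_2$, $\mathcal{C} = \mathbb{R}^2$, with candidate saddle point $\bar{x} = (0.5, 0.5)$, $\bar{y} = 1$. It only samples a finite grid, so it illustrates (14) rather than proving it.

```python
def lagrangian(x, y):
    """L(x, y) = f(x) + y * g(x) for the assumed toy problem."""
    f = x[0] ** 2 + x[1] ** 2
    g = 1.0 - x[0] - x[1]
    return f + y * g


x_bar, y_bar = (0.5, 0.5), 1.0
saddle_value = lagrangian(x_bar, y_bar)

tol = 1e-12
grid = [i / 10.0 for i in range(-20, 21)]   # sample of C = R^2
multipliers = [0.0, 0.5, 1.0, 2.0, 10.0]    # sample of y >= 0

# Left inequality of (14): L(x_bar, y) <= L(x_bar, y_bar) for y >= 0.
left_ok = all(lagrangian(x_bar, y) <= saddle_value + tol for y in multipliers)
# Right inequality of (14): L(x_bar, y_bar) <= L(x, y_bar) for x in C.
right_ok = all(saddle_value <= lagrangian((a, b), y_bar) + tol
               for a in grid for b in grid)
print(saddle_value, left_ok, right_ok)
```

Here $L(x, \bar{y}) = (x_1 - 0.5)^2 + (x_2 - 0.5)^2 + 0.5$, so the right inequality holds for every $x$, and $g(\bar{x}) = 0$ makes the left inequality an equality.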

## Karush-Kuhn-Tucker Theorem

Theorem 2. Let the problem (CO) be given and assume that the Slater regularity condition is satisfied. The vector $\bar{x}$ is an optimal solution of (CO) if and only if there is a vector $\bar{y}$ such that $(\bar{x}, \bar{y})$ is a saddle point of the Lagrange function $L$.

Proof: The easy direction: if $(\bar{x}, \bar{y})$ is a saddle point of $L(x, y)$ then $\bar{x}$ is optimal for (CO). The proof of this part does not need any regularity condition. From the saddle point inequality (14) one has

$$f(\bar{x}) + \sum_{j=1}^m y_j g_j(\bar{x}) \le f(\bar{x}) + \sum_{j=1}^m \bar{y}_j g_j(\bar{x}) \le f(x) + \sum_{j=1}^m \bar{y}_j g_j(x)$$

for all $y \ge 0$ and for all $x \in \mathcal{C}$. From the first inequality $g_j(\bar{x}) \le 0$ for all $j = 1,\dots,m$ follows, hence $\bar{x} \in \mathcal{F}$ is feasible for (CO). Taking the two extreme sides of the above inequality and substituting $y = 0$ we have

$$f(\bar{x}) \le f(x) + \sum_{j=1}^m \bar{y}_j g_j(x) \le f(x) \quad \text{for all } x \in \mathcal{F},$$

i.e. $\bar{x}$ is optimal.

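## KKT proof (continued)

To prove the other direction we need Slater regularity and the Convex Farkas Lemma 2. Let us take an optimal solution $\bar{x}$ of the convex optimization problem (CO). Then the inequality system

$$f(x) - f(\bar{x}) < 0,\qquad g_j(x) \le 0,\ j = 1,\dots,m,\qquad x \in \mathcal{C}$$

is infeasible. Applying the Convex Farkas Lemma 2 one has $\bar{y} \ge 0$ such that

$$f(x) - f(\bar{x}) + \sum_{j=1}^m \bar{y}_j g_j(x) \ge 0 \quad \text{for all } x \in \mathcal{C}.$$

Substituting $x = \bar{x}$ gives $\sum_j \bar{y}_j g_j(\bar{x}) \ge 0$, and feasibility of $\bar{x}$ gives the reverse inequality, so $\sum_j \bar{y}_j g_j(\bar{x}) = 0$ and $L(\bar{x}, \bar{y}) = f(\bar{x})$. Using that $\bar{x}$ is feasible one easily derives the saddle point inequality

$$f(\bar{x}) + \sum_{j=1}^m y_j g_j(\bar{x}) \le f(\bar{x}) \le f(x) + \sum_{j=1}^m \bar{y}_j g_j(x),$$

which completes the proof.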

## KKT Corollaries

Corollary 1. Under the assumptions of Theorem 2 the vector $\bar{x} \in \mathcal{C}$ is an optimal solution of (CO) if and only if there exists a $\bar{y} \ge 0$ such that

(i) $f(\bar{x}) = \min_{x \in \mathcal{C}} \{f(x) + \sum_{j=1}^m \bar{y}_j g_j(x)\}$ and
(ii) $\sum_{j=1}^m \bar{y}_j g_j(\bar{x}) = \max_{y \ge 0} \{\sum_{j=1}^m y_j g_j(\bar{x})\}$.

Corollary 2. Under the assumptions of Theorem 2 the vector $\bar{x} \in \mathcal{F}$ is an optimal solution of (CO) if and only if there exists a $\bar{y} \ge 0$ such that

(i) $f(\bar{x}) = \min_{x \in \mathcal{C}} \{f(x) + \sum_{j=1}^m \bar{y}_j g_j(x)\}$ and
(ii) $\sum_{j=1}^m \bar{y}_j g_j(\bar{x}) = 0$.

Corollary 3. Assume that $\mathcal{C} = \mathbb{R}^n$ and the functions $f, g_1,\dots,g_m$ are continuously differentiable. Under the assumptions of Theorem 2 the vector $\bar{x} \in \mathcal{F}$ is an optimal solution of (CO) if and only if there exists a $\bar{y} \ge 0$ such that

(i) $0 = \nabla f(\bar{x}) + \sum_{j=1}^m \bar{y}_j \nabla g_j(\bar{x})$ and
(ii) $\sum_{j=1}^m \bar{y}_j g_j(\bar{x}) = 0$.

## KKT point

Definition 4. Assume that $\mathcal{C} = \mathbb{R}^n$ and the functions $f, g_1,\dots,g_m$ are continuously differentiable. The vector $(\bar{x}, \bar{y}) \in \mathbb{R}^{n+m}$ is called a Karush-Kuhn-Tucker (KKT) point of (CO) if

(i) $g_j(\bar{x}) \le 0$ for all $j \in J$,
(ii) $0 = \nabla f(\bar{x}) + \sum_{j=1}^m \bar{y}_j \nabla g_j(\bar{x})$,
(iii) $\sum_{j=1}^m \bar{y}_j g_j(\bar{x}) = 0$,
(iv) $\bar{y} \ge 0$.

Corollary 4. Assume that $\mathcal{C} = \mathbb{R}^n$, the functions $f, g_1,\dots,g_m$ are continuously differentiable convex functions, and the assumptions of Theorem 2 hold. If $(\bar{x}, \bar{y})$ is a KKT point, then $\bar{x}$ is an optimal solution of (CO).
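The four conditions of Definition 4 translate directly into a numerical test. The sketch below assumes a toy problem that does not appear in the slides: $f(x) = (x_1 - 2)^2 + (x_2 - 1)^2$ with the single linear constraint $g(x) = x_1 + x_2 - 2 \le 0$; the function name `is_kkt_point` and the tolerance are illustrative choices.

```python
def is_kkt_point(x, y, tol=1e-9):
    """Check conditions (i)-(iv) of Definition 4 for the assumed toy problem."""
    g = x[0] + x[1] - 2.0
    grad_f = (2.0 * (x[0] - 2.0), 2.0 * (x[1] - 1.0))
    grad_g = (1.0, 1.0)

    primal_feasible = g <= tol                                      # (i)
    stationary = all(abs(df + y * dg) <= tol
                     for df, dg in zip(grad_f, grad_g))             # (ii)
    complementary = abs(y * g) <= tol                               # (iii)
    dual_feasible = y >= 0.0                                        # (iv)
    return primal_feasible and stationary and complementary and dual_feasible


# Projection of the unconstrained minimizer (2, 1) onto x1 + x2 <= 2.
candidate = is_kkt_point((1.5, 0.5), 1.0)
# The unconstrained minimizer itself violates (i).
infeasible = is_kkt_point((2.0, 1.0), 0.0)
# Feasible and complementary, but the gradient condition (ii) fails.
not_stationary = is_kkt_point((1.0, 1.0), 0.0)
print(candidate, infeasible, not_stationary)
```

Since this toy problem is convex and its only constraint is linear, Corollary 4 suggests that the accepted point $(1.5, 0.5)$ is optimal.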

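## Duality in CO: Lagrange dual

Definition 5. Denote

$$\psi(y) = \inf_{x \in \mathcal{C}} \Big\{ f(x) + \sum_{j=1}^m y_j g_j(x) \Big\}.$$

The problem

$$\text{(LD)}\qquad \sup_{y \ge 0} \psi(y)$$

is called the Lagrange Dual of problem (CO).

Lemma 4. The Lagrange Dual (LD) of (CO) is a convex optimization problem, even if the functions $f, g_1,\dots,g_m$ are not convex.

Proof: We show that $\psi(y)$ is concave. Let $\bar{y}, \hat{y} \ge 0$ and $0 \le \lambda \le 1$. Then

$$\begin{aligned}
\psi(\lambda\bar{y} + (1-\lambda)\hat{y}) &= \inf_{x \in \mathcal{C}} \Big\{ f(x) + \sum_{j=1}^m (\lambda\bar{y}_j + (1-\lambda)\hat{y}_j) g_j(x) \Big\} \\
&= \inf_{x \in \mathcal{C}} \Big\{ \lambda\Big[f(x) + \sum_{j=1}^m \bar{y}_j g_j(x)\Big] + (1-\lambda)\Big[f(x) + \sum_{j=1}^m \hat{y}_j g_j(x)\Big] \Big\} \\
&\ge \inf_{x \in \mathcal{C}} \Big\{ \lambda\Big[f(x) + \sum_{j=1}^m \bar{y}_j g_j(x)\Big] \Big\} + \inf_{x \in \mathcal{C}} \Big\{ (1-\lambda)\Big[f(x) + \sum_{j=1}^m \hat{y}_j g_j(x)\Big] \Big\} \\
&= \lambda\psi(\bar{y}) + (1-\lambda)\psi(\hat{y}).
\end{aligned}$$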
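Lemma 4 can be illustrated with a deliberately nonconvex objective. The sketch below assumes a toy instance not taken from the slides: $f(x) = x^4 - 3x^2$, $g(x) = x - 1$, and the infimum over $\mathcal{C} = [-2, 2]$ replaced by a minimum over a finite grid. A minimum of finitely many functions that are affine in $y$ is still concave, so the midpoint test should pass up to rounding.

```python
def psi(y, grid):
    """Grid approximation of the dual function for the assumed nonconvex toy problem."""
    return min(x ** 4 - 3.0 * x ** 2 + y * (x - 1.0) for x in grid)


grid = [-2.0 + 4.0 * i / 400 for i in range(401)]  # sample of C = [-2, 2]
ys = [0.5 * k for k in range(21)]                  # multipliers in [0, 10]

# Midpoint concavity: psi((y1 + y2) / 2) >= (psi(y1) + psi(y2)) / 2.
concave = all(
    psi(0.5 * (y1 + y2), grid) >= 0.5 * (psi(y1, grid) + psi(y2, grid)) - 1e-9
    for y1 in ys
    for y2 in ys
)
print(concave, psi(0.0, grid))
```

The value `psi(0.0, grid)` is just the grid minimum of the nonconvex $f$, which shows that concavity of $\psi$ does not rely on convexity of $f$.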

## Results on the Lagrange dual

Theorem 3 (Weak duality). If $\bar{x}$ is a feasible solution of (CO) and $\bar{y} \ge 0$ then $\psi(\bar{y}) \le f(\bar{x})$, and equality holds if and only if

$$\inf_{x \in \mathcal{C}} \Big\{ f(x) + \sum_{j=1}^m \bar{y}_j g_j(x) \Big\} = f(\bar{x}).$$

Proof:

$$\psi(\bar{y}) = \inf_{x \in \mathcal{C}} \Big\{ f(x) + \sum_{j=1}^m \bar{y}_j g_j(x) \Big\} \le f(\bar{x}) + \sum_{j=1}^m \bar{y}_j g_j(\bar{x}) \le f(\bar{x}).$$

Equality holds if and only if $\inf_{x \in \mathcal{C}} \{f(x) + \sum_{j=1}^m \bar{y}_j g_j(x)\} = f(\bar{x})$, and hence $\bar{y}_j g_j(\bar{x}) = 0$ for all $j \in J$.

Corollary 5. If $\bar{x}$ is a feasible solution of (CO), $\bar{y} \ge 0$ and $\psi(\bar{y}) = f(\bar{x})$, then the vector $\bar{x}$ is an optimal solution of (CO) and $\bar{y}$ is optimal for (LD). Further, if the functions $f, g_1,\dots,g_m$ are continuously differentiable then $(\bar{x}, \bar{y})$ is a KKT point.

Theorem 4 (Strong duality). Assume that (CO) satisfies the Slater regularity condition. Let $\bar{x}$ be a feasible solution of (CO). The vector $\bar{x}$ is an optimal solution of (CO) if and only if there exists a $\bar{y} \ge 0$ such that $\bar{y}$ is an optimal solution of (LD) and $\psi(\bar{y}) = f(\bar{x})$.
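Weak and strong duality can both be seen on a one-dimensional instance. The sketch below assumes the toy problem $\min x^2$ subject to $1 - x \le 0$ with $\mathcal{C} = \mathbb{R}$, which is not from the slides. For this problem the inner infimum is attained at $x = y/2$, so $\psi(y) = y - y^2/4$; the sample grids are arbitrary.

```python
def f(x):
    return x * x


def psi(y):
    """Dual function of min x^2 s.t. 1 - x <= 0; the infimum is attained at x = y / 2."""
    x = y / 2.0
    return x * x + y * (1.0 - x)


feasible_xs = [1.0 + 0.25 * i for i in range(13)]  # samples of x >= 1
ys = [0.5 * k for k in range(13)]                  # samples of y >= 0

# Theorem 3: every dual value bounds every feasible primal value from below.
weak_duality_holds = all(psi(y) <= f(x) + 1e-12 for x in feasible_xs for y in ys)

# Theorem 4: the Slater condition holds (e.g. x = 2), so the best bound is tight.
best_y = max(ys, key=psi)
print(weak_duality_holds, best_y, psi(best_y), f(1.0))
```

On this instance the dual maximum is reached at $\bar{y} = 2$ with $\psi(\bar{y}) = 1 = f(1)$, so the duality gap is zero.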

## Wolfe dual

Definition 6. Assume that $\mathcal{C} = \mathbb{R}^n$ and the functions $f, g_1,\dots,g_m$ are continuously differentiable and convex. The problem

$$\text{(WD)}\qquad \sup \Big\{ f(x) + \sum_{j=1}^m y_j g_j(x) \Big\} \quad \text{s.t.}\quad \nabla f(x) + \sum_{j=1}^m y_j \nabla g_j(x) = 0,\quad y \ge 0$$

is called the Wolfe Dual of the convex optimization problem (CO).

Warning! We are only allowed to form the Wolfe dual of a nonlinear optimization problem if it is convex. For nonconvex problems one has to work with the Lagrange dual.

## Examples for dual problems

Linear optimization:

$$\text{(LO)}\qquad \min\{c^T x \mid Ax = b,\ x \ge 0\}.$$

The constraints in the form of (CO) are

$$g_j(x) = \begin{cases} (a^j)^T x - b_j & \text{if } j = 1,\dots,m; \\ (-a^{j-m})^T x + b_{j-m} & \text{if } j = m+1,\dots,2m; \\ -x_{j-2m} & \text{if } j = 2m+1,\dots,2m+n. \end{cases}$$

Denote the Lagrange multipliers by $y^-$, $y^+$ and $s$; then the Wolfe dual (WD) of (LO) is:

$$\max\ c^T x + (y^-)^T(Ax - b) + (y^+)^T(-Ax + b) + s^T(-x) \quad \text{s.t.}\quad c + A^T y^- - A^T y^+ - s = 0,\quad y^- \ge 0,\ y^+ \ge 0,\ s \ge 0.$$

Substitute $c = -A^T y^- + A^T y^+ + s$ and let $y = y^+ - y^-$; then

$$\max\ b^T y \quad \text{s.t.}\quad A^T y + s = c,\quad s \ge 0.$$

Note that the KKT conditions provide the well-known complementary slackness condition $x^T s = 0$.
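The LO primal-dual pair and the complementary slackness condition can be verified by hand on a tiny instance. The sketch below assumes the toy data $c = (1, 2)$, $A = (1\ \ 1)$, $b = 1$ with the candidate solutions $x = (1, 0)$ and $y = 1$; none of these numbers come from the slides, and the code only checks the stated conditions rather than solving the problem.

```python
c = [1.0, 2.0]
A = [[1.0, 1.0]]   # one equality constraint: x1 + x2 = 1
b = [1.0]

x = [1.0, 0.0]     # assumed primal candidate
y = [1.0]          # assumed dual candidate

m, n = len(A), len(c)

# Dual slack from the dual constraint A^T y + s = c.
s = [c[i] - sum(A[j][i] * y[j] for j in range(m)) for i in range(n)]

primal_feasible = (
    all(abs(sum(A[j][i] * x[i] for i in range(n)) - b[j]) <= 1e-12 for j in range(m))
    and all(xi >= 0.0 for xi in x)
)
dual_feasible = all(si >= 0.0 for si in s)

primal_value = sum(ci * xi for ci, xi in zip(c, x))     # c^T x
dual_value = sum(bj * yj for bj, yj in zip(b, y))       # b^T y
complementarity = sum(xi * si for xi, si in zip(x, s))  # x^T s
print(s, primal_value, dual_value, complementarity)
```

With both candidates feasible, equal objective values and $x^T s = 0$ certify optimality of the pair by weak duality.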

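## Quadratic optimization

$$\text{(QO)}\qquad \min\{c^T x + \tfrac{1}{2} x^T Q x \mid Ax \ge b,\ x \ge 0\}.$$

The constraints in the form of (CO) are

$$g_j(x) = \begin{cases} (-a^j)^T x + b_j & \text{if } j = 1,\dots,m; \\ -x_{j-m} & \text{if } j = m+1,\dots,m+n. \end{cases}$$

Lagrange multipliers: $y$ and $s$. The Wolfe dual (WD) of (QO) is:

$$\max\ c^T x + \tfrac{1}{2} x^T Q x + y^T(-Ax + b) + s^T(-x) \quad \text{s.t.}\quad c + Qx - A^T y - s = 0,\quad y \ge 0,\ s \ge 0.$$

Substitute $c = -Qx + A^T y + s$ in the objective:

$$\max\ b^T y - \tfrac{1}{2} x^T Q x \quad \text{s.t.}\quad -Qx + A^T y + s = c,\quad y \ge 0,\ s \ge 0.$$

With $Q = D^T D$ (e.g. Cholesky) and $z = Dx$, the following (QD) dual problem is obtained:

$$\max\ b^T y - \tfrac{1}{2} z^T z \quad \text{s.t.}\quad -D^T z + A^T y + s = c,\quad y \ge 0,\ s \ge 0.$$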

## Constrained maximum likelihood estimation

Given a finite set of sample points $x_i$ ($1 \le i \le n$), find the most probable density values that satisfy some linear (e.g. convexity) constraints. Maximize the likelihood function $\prod_{i=1}^n x_i$ under the conditions

$$Ax \ge 0,\qquad d^T x = 1,\qquad x \ge 0.$$

The objective can equivalently be replaced by

$$\min\ -\sum_{i=1}^n \ln x_i.$$

Lagrange multipliers: $y \in \mathbb{R}^m$, $t \in \mathbb{R}$ and $s \in \mathbb{R}^n$. The Wolfe dual (WD) is:

$$\max\ -\sum_{i=1}^n \ln x_i + y^T(-Ax) + t(d^T x - 1) + s^T(-x) \quad \text{s.t.}\quad -X^{-1} e - A^T y + t d - s = 0,\quad y \ge 0,\ s \ge 0.$$

Multiplying the first constraint by $x^T$ one has

$$-x^T X^{-1} e - x^T A^T y + t\, x^T d - x^T s = 0.$$

Using $d^T x = 1$, $x^T X^{-1} e = n$ and the optimality conditions $y^T A x = 0$, $x^T s = 0$ we have $t = n$. Since $x$ is necessarily strictly positive, the dual variable $s$ must be zero at the optimum.

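## Constrained maximum likelihood estimation: 2

$$\max\ -\sum_{i=1}^n \ln x_i \quad \text{s.t.}\quad X^{-1} e + A^T y = n d,\quad y \ge 0.$$

Eliminating the variables $x_i > 0$:

$$x_i = \frac{1}{n d_i - a_i^T y} \quad \text{and} \quad -\ln x_i = \ln(n d_i - a_i^T y) \quad \text{for all } i,$$

which gives

$$\max\ \sum_{i=1}^n \ln(n d_i - a_i^T y) \quad \text{s.t.}\quad A^T y \le n d,\quad y \ge 0.$$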

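## Example: positive duality gap

Duffin's convex optimization problem:

$$\text{(CO)}\qquad \min\ e^{-x_2} \quad \text{s.t.}\quad \sqrt{x_1^2 + x_2^2} - x_1 \le 0,\quad x \in \mathbb{R}^2.$$

The feasible region is $\mathcal{F} = \{x \in \mathbb{R}^2 \mid x_1 \ge 0,\ x_2 = 0\}$. (CO) is not Slater regular. The optimal value of the objective function is 1.

The Lagrange function is given by

$$L(x, y) = e^{-x_2} + y\Big(\sqrt{x_1^2 + x_2^2} - x_1\Big).$$

Now, let $\epsilon = \sqrt{x_1^2 + x_2^2} - x_1$; then

$$x_2^2 - 2\epsilon x_1 - \epsilon^2 = 0.$$

Hence, for any $\epsilon > 0$ we can find $x_1 > 0$ such that $\epsilon = \sqrt{x_1^2 + x_2^2} - x_1$, even if $x_2$ goes to infinity. However, when $x_2$ goes to infinity, $e^{-x_2}$ goes to 0. So

$$\psi(y) = \inf_{x \in \mathbb{R}^2} \Big\{ e^{-x_2} + y\Big(\sqrt{x_1^2 + x_2^2} - x_1\Big) \Big\} = 0,$$

thus the optimal value of the Lagrange dual

$$\text{(LD)}\qquad \max\ \psi(y) \quad \text{s.t.}\quad y \ge 0$$

is 0. The duality gap is nonzero: it equals 1.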
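The argument that $\psi(y) = 0$ in Duffin's example can be followed numerically. The sketch below fixes a multiplier, picks points with $\sqrt{x_1^2 + x_2^2} - x_1 = \epsilon$ by solving $x_2^2 - 2\epsilon x_1 - \epsilon^2 = 0$ for $x_1$, and lets $\epsilon \to 0$ while $x_2 \to \infty$. The particular sequence ($\epsilon = 10^{-k}$, $x_2 = 10k$), the multiplier $y = 5$ and the helper names are illustrative assumptions.

```python
import math


def lagrangian(x1, x2, y):
    """L(x, y) for Duffin's problem.

    Uses sqrt(x1^2 + x2^2) - x1 = x2^2 / (sqrt(x1^2 + x2^2) + x1),
    which avoids cancellation when x1 > 0 is large.
    """
    g = x2 ** 2 / (math.hypot(x1, x2) + x1)
    return math.exp(-x2) + y * g


def x1_for_constraint_value(eps, x2):
    """Solve x2^2 - 2*eps*x1 - eps^2 = 0 for x1."""
    return (x2 ** 2 - eps ** 2) / (2.0 * eps)


y = 5.0
values = []
for k in range(1, 6):
    eps, x2 = 10.0 ** (-k), 10.0 * k
    x1 = x1_for_constraint_value(eps, x2)
    values.append(lagrangian(x1, x2, y))

primal_optimum = math.exp(-0.0)  # objective at the feasible point (0, 0)
print(values, primal_optimum)
```

The Lagrangian values decrease towards 0 along this sequence while staying positive, whereas every feasible point has objective value 1, which matches the duality gap of 1 stated above.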

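## Example: infinite duality gap

Duffin's example slightly modified:

$$\min\ x_2 \quad \text{s.t.}\quad \sqrt{x_1^2 + x_2^2} - x_1 \le 0.$$

The feasible region is $\mathcal{F} = \{x \in \mathbb{R}^2 \mid x_1 \ge 0,\ x_2 = 0\}$. The problem is not Slater regular. The optimal value of the objective function is 0.

The Lagrange function is given by

$$L(x, y) = x_2 + y\Big(\sqrt{x_1^2 + x_2^2} - x_1\Big).$$

So

$$\psi(y) = \inf_{x \in \mathbb{R}^2} \Big\{ x_2 + y\Big(\sqrt{x_1^2 + x_2^2} - x_1\Big) \Big\} = -\infty,$$

thus the optimal value of the Lagrange dual

$$\text{(LD)}\qquad \max\ \psi(y) \quad \text{s.t.}\quad y \ge 0$$

is $-\infty$, because $\psi(y)$ is minus infinity for every $y \ge 0$.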

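## Semidefinite optimization

Let $A_0, A_1,\dots,A_n \in \mathbb{R}^{m \times m}$ be symmetric matrices, let $c \in \mathbb{R}^n$ and $x \in \mathbb{R}^n$. The primal SDO problem is defined as

$$\text{(PSO)}\qquad \min\ c^T x \quad \text{s.t.}\quad -A_0 + \sum_{k=1}^n A_k x_k \succeq 0, \tag{18}$$

where we write $F(x) = -A_0 + \sum_{k=1}^n A_k x_k$. The dual SDO problem is:

$$\text{(DSO)}\qquad \max\ \mathrm{Tr}(A_0 Z) \quad \text{s.t.}\quad \mathrm{Tr}(A_k Z) = c_k \text{ for all } k = 1,\dots,n,\quad Z \succeq 0. \tag{19}$$

Theorem 5 (Weak duality). If $x \in \mathbb{R}^n$ is primal feasible and $Z \in \mathbb{R}^{m \times m}$ is dual feasible, then

$$c^T x \ge \mathrm{Tr}(A_0 Z),$$

with equality if and only if $F(x) Z = 0$.

Proof:

$$c^T x - \mathrm{Tr}(A_0 Z) = \sum_{k=1}^n \mathrm{Tr}(A_k Z)\, x_k - \mathrm{Tr}(A_0 Z) = \mathrm{Tr}\Big(\Big(\sum_{k=1}^n A_k x_k - A_0\Big) Z\Big) = \mathrm{Tr}(F(x) Z) \ge 0.$$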
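The identity $c^T x - \mathrm{Tr}(A_0 Z) = \mathrm{Tr}(F(x) Z)$ from the proof of Theorem 5 can be checked on a small instance. The sketch below assumes $2 \times 2$ toy data that is not from the slides: $A_0$ is the all-ones matrix, $A_1 = \mathrm{diag}(1, 0)$, $A_2 = \mathrm{diag}(0, 1)$ and $c = (1, 1)$, so that $F(x) \succeq 0$ means $x_1, x_2 \ge 1$ and $(x_1 - 1)(x_2 - 1) \ge 1$. The helper names and the $2 \times 2$ positive semidefiniteness test are illustrative.

```python
def trace_product(A, B):
    """Tr(AB) for square matrices stored as nested lists."""
    n = len(A)
    return sum(A[i][k] * B[k][i] for i in range(n) for k in range(n))


def is_psd_2x2(M, tol=1e-12):
    """Symmetric 2x2 matrix is PSD iff diagonal and determinant are nonnegative."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return M[0][0] >= -tol and M[1][1] >= -tol and det >= -tol


A0 = [[1.0, 1.0], [1.0, 1.0]]
A = [[[1.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 1.0]]]
c = [1.0, 1.0]


def F(x):
    """F(x) = -A0 + sum_k A_k x_k."""
    return [[-A0[i][j] + sum(A[k][i][j] * x[k] for k in range(2))
             for j in range(2)] for i in range(2)]


def duality_gap(x, Z):
    return sum(ck * xk for ck, xk in zip(c, x)) - trace_product(A0, Z)


# A feasible but non-optimal pair.
x, Z = [3.0, 2.0], [[1.0, 0.5], [0.5, 1.0]]
feasible = (is_psd_2x2(F(x)) and is_psd_2x2(Z)
            and all(trace_product(A[k], Z) == c[k] for k in range(2)))
gap, trace_FZ = duality_gap(x, Z), trace_product(F(x), Z)

# A pair with zero gap: here F(x) Z = 0.
x_opt, Z_opt = [2.0, 2.0], [[1.0, 1.0], [1.0, 1.0]]
gap_opt = duality_gap(x_opt, Z_opt)
print(feasible, gap, trace_FZ, gap_opt)
```

For the second pair the gap vanishes, consistent with the equality condition $F(x) Z = 0$ in Theorem 5.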

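## SDO Dual as Lagrange Dual

$$\text{(PSO')}\qquad \min\ c^T x \quad \text{s.t.}\quad -F(x) + S = 0,\quad S \succeq 0. \tag{20}$$

The Lagrange function $L(x, S, Z)$ is defined on $\{(x, S, Z) \mid x \in \mathbb{R}^n,\ S \in \mathbb{R}^{m \times m},\ S \succeq 0,\ Z \in \mathbb{R}^{m \times m}\}$ by

$$L(x, S, Z) = c^T x - e^T (F(x) \circ Z) e + e^T (S \circ Z) e,$$

where $X \circ Z$ denotes the entrywise (Hadamard) product, so that $e^T (S \circ Z) e = \mathrm{Tr}(SZ)$ for symmetric matrices. Hence

$$L(x, S, Z) = c^T x - \sum_{k=1}^n x_k \mathrm{Tr}(A_k Z) + \mathrm{Tr}(A_0 Z) + \mathrm{Tr}(SZ). \tag{21}$$

The Lagrange dual is

$$\text{(DSDL)}\qquad \max\ \{\psi(Z) \mid Z \in \mathbb{R}^{m \times m}\}, \tag{22}$$

$$\psi(Z) = \min\{L(x, S, Z) \mid x \in \mathbb{R}^n,\ S \in \mathbb{R}^{m \times m},\ S \succeq 0\}. \tag{23}$$

Optimality conditions to get $\psi(Z)$:

$$\min_{S \succeq 0} \mathrm{Tr}(SZ) = \begin{cases} 0 & \text{if } Z \succeq 0, \\ -\infty & \text{otherwise.} \end{cases} \tag{24}$$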

## SDO dual (continued)

If we minimize (23) in $x$, we need to equate the $x$-gradient of $L(x, S, Z)$ to zero:

$$c_k - \mathrm{Tr}(A_k Z) = 0 \quad \text{for all } k = 1,\dots,n. \tag{25}$$

Multiplying (25) by $x_k$ and summing up gives

$$c^T x - \sum_{k=1}^n x_k \mathrm{Tr}(A_k Z) = 0.$$

By combining the last formula with the results presented in (24) and (25), the simplified form of the Lagrange dual (22), the Lagrange-Wolfe dual

$$\text{(DSO)}\qquad \max\ \mathrm{Tr}(A_0 Z) \quad \text{s.t.}\quad \mathrm{Tr}(A_k Z) = c_k \text{ for all } k = 1,\dots,n,\quad Z \succeq 0,$$

follows. It is identical to (19).

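## Duality in cone-linear optimization

$$\min\ c^T x \quad \text{s.t.}\quad Ax - b \in \mathcal{C}_1,\quad x \in \mathcal{C}_2. \tag{26}$$

Definition 7. Let $\mathcal{C} \subseteq \mathbb{R}^n$ be a convex cone. The dual cone (polar, positive polar) $\mathcal{C}^*$ is defined by

$$\mathcal{C}^* := \{z \in \mathbb{R}^n \mid x^T z \ge 0 \text{ for all } x \in \mathcal{C}\}.$$

The dual of a cone-linear problem. Rewrite (26) as

$$\min\ c^T x \quad \text{s.t.}\quad s - Ax + b = 0,\quad s \in \mathcal{C}_1,\quad x \in \mathcal{C}_2.$$

The linear equality constraints are $s - Ax + b = 0$, and $(s, x)$ must be in the convex cone

$$\mathcal{C}_1 \times \mathcal{C}_2 := \{(s, x) \mid s \in \mathcal{C}_1,\ x \in \mathcal{C}_2\}.$$

The Lagrange function $L(s, x, y)$ is defined on $\{(s, x, y) \mid s \in \mathcal{C}_1,\ x \in \mathcal{C}_2,\ y \in \mathbb{R}^m\}$ and is given by

$$L(s, x, y) = c^T x + y^T(s - Ax + b) = b^T y + s^T y + x^T(c - A^T y).$$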

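## The Lagrange dual of the cone-linear problem

$$\max_{y \in \mathbb{R}^m} \psi(y), \quad \text{where} \quad \psi(y) = \min\{L(s, x, y) \mid s \in \mathcal{C}_1,\ x \in \mathcal{C}_2\}.$$

The vector $s$ is in the cone $\mathcal{C}_1$, hence

$$\min_{s \in \mathcal{C}_1} s^T y = \begin{cases} 0 & \text{if } y \in \mathcal{C}_1^*, \\ -\infty & \text{otherwise,} \end{cases}
\qquad
\min_{x \in \mathcal{C}_2} x^T(c - A^T y) = \begin{cases} 0 & \text{if } c - A^T y \in \mathcal{C}_2^*, \\ -\infty & \text{otherwise.} \end{cases}$$

By combining these we have

$$\psi(y) = \begin{cases} b^T y & \text{if } y \in \mathcal{C}_1^* \text{ and } c - A^T y \in \mathcal{C}_2^*, \\ -\infty & \text{otherwise.} \end{cases}$$

Thus the dual of (26) is:

$$\max\ b^T y \quad \text{s.t.}\quad c - A^T y \in \mathcal{C}_2^*,\quad y \in \mathcal{C}_1^*.$$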


### Lecture 7: Convex Optimizations Lecture 7: Convex Optimizations Radu Balan, David Levermore March 29, 2018 Convex Sets. Convex Functions A set S R n is called a convex set if for any points x, y S the line segment [x, y] := {tx + (1

### Additional Homework Problems Additional Homework Problems Robert M. Freund April, 2004 2004 Massachusetts Institute of Technology. 1 2 1 Exercises 1. Let IR n + denote the nonnegative orthant, namely IR + n = {x IR n x j ( ) 0,j =1,...,n}.

### Part IB Optimisation Part IB Optimisation Theorems Based on lectures by F. A. Fischer Notes taken by Dexter Chua Easter 2015 These notes are not endorsed by the lecturers, and I have modified them (often significantly) after

### The fundamental theorem of linear programming The fundamental theorem of linear programming Michael Tehranchi June 8, 2017 This note supplements the lecture notes of Optimisation The statement of the fundamental theorem of linear programming and the

### CS-E4830 Kernel Methods in Machine Learning CS-E4830 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 27. September, 2017 Juho Rousu 27. September, 2017 1 / 45 Convex optimization Convex optimisation This

### Enhanced Fritz John Optimality Conditions and Sensitivity Analysis Enhanced Fritz John Optimality Conditions and Sensitivity Analysis Dimitri P. Bertsekas Laboratory for Information and Decision Systems Massachusetts Institute of Technology March 2016 1 / 27 Constrained

### Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014 Convex Optimization Dani Yogatama School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA February 12, 2014 Dani Yogatama (Carnegie Mellon University) Convex Optimization February 12,

### subject to (x 2)(x 4) u, Exercises Basic definitions 5.1 A simple example. Consider the optimization problem with variable x R. minimize x 2 + 1 subject to (x 2)(x 4) 0, (a) Analysis of primal problem. Give the feasible set, the

### ICS-E4030 Kernel Methods in Machine Learning ICS-E4030 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 28. September, 2016 Juho Rousu 28. September, 2016 1 / 38 Convex optimization Convex optimisation This

### On the Method of Lagrange Multipliers On the Method of Lagrange Multipliers Reza Nasiri Mahalati November 6, 2016 Most of what is in this note is taken from the Convex Optimization book by Stephen Boyd and Lieven Vandenberghe. This should

### Example: feasibility. Interpretation as formal proof. Example: linear inequalities and Farkas lemma 4-1 Algebra and Duality P. Parrilo and S. Lall 2006.06.07.01 4. Algebra and Duality Example: non-convex polynomial optimization Weak duality and duality gap The dual is not intrinsic The cone of valid

### Lecture 8. Strong Duality Results. September 22, 2008 Strong Duality Results September 22, 2008 Outline Lecture 8 Slater Condition and its Variations Convex Objective with Linear Inequality Constraints Quadratic Objective over Quadratic Constraints Representation

### LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE CONVEX ANALYSIS AND DUALITY Basic concepts of convex analysis Basic concepts of convex optimization Geometric duality framework - MC/MC Constrained optimization

### Midterm Review. Yinyu Ye Department of Management Science and Engineering Stanford University Stanford, CA 94305, U.S.A. Midterm Review Yinyu Ye Department of Management Science and Engineering Stanford University Stanford, CA 94305, U.S.A. http://www.stanford.edu/ yyye (LY, Chapter 1-4, Appendices) 1 Separating hyperplane

### Support Vector Machines Support Vector Machines Support vector machines (SVMs) are one of the central concepts in all of machine learning. They are simply a combination of two ideas: linear classification via maximum (or optimal

### Convex Optimization Lecture 6: KKT Conditions, and applications Convex Optimization Lecture 6: KKT Conditions, and applications Dr. Michel Baes, IFOR / ETH Zürich Quick recall of last week s lecture Various aspects of convexity: The set of minimizers is convex. Convex

### Convex Optimization Theory. Chapter 5 Exercises and Solutions: Extended Version Convex Optimization Theory Chapter 5 Exercises and Solutions: Extended Version Dimitri P. Bertsekas Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com

### LECTURE 10 LECTURE OUTLINE LECTURE 10 LECTURE OUTLINE Min Common/Max Crossing Th. III Nonlinear Farkas Lemma/Linear Constraints Linear Programming Duality Convex Programming Duality Optimality Conditions Reading: Sections 4.5, 5.1,5.2,

### In view of (31), the second of these is equal to the identity I on E m, while this, in view of (30), implies that the first can be written 11.8 Inequality Constraints 341 Because by assumption x is a regular point and L x is positive definite on M, it follows that this matrix is nonsingular (see Exercise 11). Thus, by the Implicit Function

### Optimization Problems with Constraints - introduction to theory, numerical Methods and applications Optimization Problems with Constraints - introduction to theory, numerical Methods and applications Dr. Abebe Geletu Ilmenau University of Technology Department of Simulation and Optimal Processes (SOP)

### Miscellaneous Nonlinear Programming Exercises Miscellaneous Nonlinear Programming Exercises Henry Wolkowicz 2 08 21 University of Waterloo Department of Combinatorics & Optimization Waterloo, Ontario N2L 3G1, Canada Contents 1 Numerical Analysis Background

### Support Vector Machines for Regression COMP-566 Rohan Shah (1) Support Vector Machines for Regression Provided with n training data points {(x 1, y 1 ), (x 2, y 2 ),, (x n, y n )} R s R we seek a function f for a fixed ɛ > 0 such that: f(x

### Nonlinear Optimization: What s important? Nonlinear Optimization: What s important? Julian Hall 10th May 2012 Convexity: convex problems A local minimizer is a global minimizer A solution of f (x) = 0 (stationary point) is a minimizer A global

### Lecture Note 5: Semidefinite Programming for Stability Analysis ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State

### Lecture Notes on Support Vector Machine Lecture Notes on Support Vector Machine Feng Li fli@sdu.edu.cn Shandong University, China 1 Hyperplane and Margin In a n-dimensional space, a hyper plane is defined by ω T x + b = 0 (1) where ω R n is

### Some Properties of the Augmented Lagrangian in Cone Constrained Optimization MATHEMATICS OF OPERATIONS RESEARCH Vol. 29, No. 3, August 2004, pp. 479 491 issn 0364-765X eissn 1526-5471 04 2903 0479 informs doi 10.1287/moor.1040.0103 2004 INFORMS Some Properties of the Augmented

### Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Module - 5 Lecture - 22 SVM: The Dual Formulation Good morning.

### Solving generalized semi-infinite programs by reduction to simpler problems. Solving generalized semi-infinite programs by reduction to simpler problems. G. Still, University of Twente January 20, 2004 Abstract. The paper intends to give a unifying treatment of different approaches

### Math 5311 Constrained Optimization Notes ath 5311 Constrained Optimization otes February 5, 2009 1 Equality-constrained optimization Real-world optimization problems frequently have constraints on their variables. Constraints may be equality

### ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints Instructor: Prof. Kevin Ross Scribe: Nitish John October 18, 2011 1 The Basic Goal The main idea is to transform a given constrained

### On the Existence and Convergence of the Central Path for Convex Programming and Some Duality Results Computational Optimization and Applications, 10, 51 77 (1998) c 1998 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands. On the Existence and Convergence of the Central Path for Convex

### Nonlinear Programming and the Kuhn-Tucker Conditions Nonlinear Programming and the Kuhn-Tucker Conditions The Kuhn-Tucker (KT) conditions are first-order conditions for constrained optimization problems, a generalization of the first-order conditions we

### Date: July 5, Contents 2 Lagrange Multipliers Date: July 5, 2001 Contents 2.1. Introduction to Lagrange Multipliers......... p. 2 2.2. Enhanced Fritz John Optimality Conditions...... p. 14 2.3. Informative Lagrange Multipliers...........

### Optimisation in Higher Dimensions CHAPTER 6 Optimisation in Higher Dimensions Beyond optimisation in 1D, we will study two directions. First, the equivalent in nth dimension, x R n such that f(x ) f(x) for all x R n. Second, constrained
