Nonlinear Programming
Kees Roos
e-mail: C.Roos@ewi.tudelft.nl
URL: http://www.isa.ewi.tudelft.nl/ roos
LNMB Course, De Uithof, Utrecht, February 6 - May 8, A.D. 2006
Optimization Group 1
Outline for week 5

- The Lagrange-dual problem of (CO); weak duality
- Strong duality
- The Wolfe-dual problem; examples
- Special cases: linear and quadratic optimization

In the course notes: Chapter 3 up to and including 3.4.
The Lagrangean (from Week 4)

(CO)  min {f(x) : x ∈ C, g_j(x) ≤ 0, j = 1, ..., m}

The Lagrange function (or Lagrangean) of (CO) is given by

L(x, y) = f(x) + ∑_{j=1}^m y_j g_j(x),  x ∈ C, y ≥ 0.

Recall that L(x, y) is convex in x and linear in y. If F = {x ∈ C : g_j(x) ≤ 0, j = 1, ..., m}, then

∑_{j=1}^m y_j g_j(x) ≤ 0,  x ∈ F, y ≥ 0.

Hence

L(x, y) ≤ f(x),  x ∈ F, y ≥ 0.

Equality holds if and only if ∑_{j=1}^m y_j g_j(x) = 0, which is equivalent to y_j g_j(x) = 0 for j = 1, ..., m.
The Lagrange-dual problem

With

ψ(y) := inf_{x∈C} {f(x) + ∑_{j=1}^m y_j g_j(x)},

the problem

sup {ψ(y) : y ≥ 0}

is the so-called Lagrange-dual problem of (CO). The Lagrange-dual problem is defined in the same way if (CO) is not convex, and the following theorem also holds in that case.

Theorem 1 (weak duality)  sup {ψ(y) : y ≥ 0} ≤ inf_{x∈C} {f(x) : g_j(x) ≤ 0, j = 1, ..., m}.

Proof: We have (trivially!) L(x, y) ≤ f(x) for all x ∈ F, y ≥ 0, and since F ⊆ C, for each y ≥ 0,

ψ(y) = inf_{x∈C} L(x, y) ≤ inf_{x∈F} L(x, y) ≤ inf_{x∈F} f(x).

Thus the theorem follows.

The optimal values may be different! However, they are equal if (CO) satisfies the Slater condition and has finite optimal value. This is the next result.
The Lagrange-dual problem: condition for a vanishing duality gap

Theorem 2 (strong duality)  If (CO) satisfies the Slater condition and has finite optimal value, then

sup {ψ(y) : y ≥ 0} = inf_{x∈C} {f(x) : g_j(x) ≤ 0, j = 1, ..., m}.

Moreover, the dual optimal value is then attained.

Proof: Let z* be the optimal value of (CO). Taking a = z* in the Convex Farkas Lemma (Week 4), it follows that there exists a vector y* = (y*_1; ...; y*_m) ≥ 0 such that

L(x, y*) = f(x) + ∑_{j=1}^m y*_j g_j(x) ≥ z*,  x ∈ C.

By the definition of ψ(y*) this implies ψ(y*) ≥ z*. Using the weak duality theorem, it follows that ψ(y*) = z*. This not only proves that the optimal values are equal, but also that y* is an optimal solution of the dual problem.
Convexity of the (general) Lagrange-dual problem

The Lagrange-dual problem is given by

sup {ψ(y) : y ≥ 0},  where  ψ(y) = inf_{x∈C} L(x, y).

Using that L(x, y) is linear in y, we may write, for any y^1 ≥ 0, y^2 ≥ 0 and 0 ≤ λ ≤ 1,

ψ(λy^1 + (1−λ)y^2) = inf_{x∈C} L(x, λy^1 + (1−λ)y^2)
  = inf_{x∈C} (λL(x, y^1) + (1−λ)L(x, y^2))
  ≥ inf_{x∈C} λL(x, y^1) + inf_{x∈C} (1−λ)L(x, y^2)
  = λ inf_{x∈C} L(x, y^1) + (1−λ) inf_{x∈C} L(x, y^2)
  = λψ(y^1) + (1−λ)ψ(y^2),

proving that ψ(y) is concave in y.

N.B. The problem is called convex because it is equivalent to the convex problem inf {−ψ(y) : y ≥ 0}.
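The concavity of ψ can be checked numerically. Below is a minimal sketch on a toy instance of my own choosing (not from the slides): min {x² : 1 − x ≤ 0} over C = R, for which ψ(y) = inf_x (x² + y(1 − x)) = y − y²/4.

```python
import numpy as np

# Toy instance (illustrative assumption): min {x^2 : 1 - x <= 0, x in R}.
# L(x, y) = x^2 + y*(1 - x); in closed form psi(y) = y - y^2/4,
# but here we approximate the inner inf over x by a grid minimum.
def psi(y):
    xs = np.linspace(-10, 10, 20001)          # crude grid for inf_x L(x, y)
    return np.min(xs**2 + y * (1 - xs))

# Midpoint-concavity check: psi((a+b)/2) >= (psi(a) + psi(b)) / 2.
rng = np.random.default_rng(0)
for _ in range(100):
    a, b = rng.uniform(0, 5, size=2)
    assert psi((a + b) / 2) >= (psi(a) + psi(b)) / 2 - 1e-6
print("psi passed all midpoint-concavity checks")
```

The small tolerance absorbs the grid-discretization error in the inner minimization.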
Example

Consider

min {f(x_1, x_2) = x_1² + x_2² : x_1 + x_2 ≥ 1, (x_1, x_2) ∈ R²}.

Now

L(x, y) = x_1² + x_2² + y(1 − x_1 − x_2),  (x_1, x_2) ∈ R², y ≥ 0.

Since L(x, y) is convex in x, L(x, y) is minimal if and only if ∇_x L(x, y) = 0. This holds if 2x_1 − y = 0 and 2x_2 − y = 0, which has a solution, namely

x_1 = y/2,  x_2 = y/2.

Substitution gives

ψ(y) = y²/2 + y(1 − y) = y − y²/2.

So the dual problem is

max {y − y²/2 : y ≥ 0}.

The optimal value of y is 1. So x_1 = x_2 = 1/2 is the optimal solution of the original (primal) problem. In both cases the optimal value equals 1/2. So, at optimality the duality gap is zero!
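This example can be verified numerically; the sketch below maximizes ψ(y) = y − y²/2 over a grid and recovers the primal point x_1 = x_2 = y/2 from the dual maximizer.

```python
import numpy as np

# Primal: min x1^2 + x2^2  s.t.  x1 + x2 >= 1.   Dual: max {y - y^2/2 : y >= 0}.
ys = np.linspace(0, 3, 30001)
psi_vals = ys - ys**2 / 2
y_star = ys[np.argmax(psi_vals)]      # dual maximizer, ~1
dual_val = psi_vals.max()             # ~1/2

x1 = x2 = y_star / 2                  # primal point recovered from the dual
primal_val = x1**2 + x2**2            # ~1/2

print(y_star, dual_val, primal_val)   # ~1.0, ~0.5, ~0.5: zero duality gap
```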
Example

(CO)  min x  s.t.  x² ≤ 0,  x ∈ R.

This (CO) problem is not Slater regular. On the other hand, we have

ψ(y) = inf_{x∈R} (x + yx²) = −1/(4y) for y > 0,  −∞ for y = 0.

We see that ψ(y) < 0 for all y ≥ 0. One has

sup {ψ(y) : y ≥ 0} = 0.

So the Lagrange-dual has the same optimal value as the primal problem. In spite of the lack of Slater regularity there is no duality gap.
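A short numeric check of this example: the inner infimum is attained at x = −1/(2y), giving ψ(y) = −1/(4y), which approaches but never attains the primal optimal value 0.

```python
import numpy as np

# (CO): min {x : x^2 <= 0}; the only feasible point is x = 0, optimal value 0.
# For y > 0, inf_x (x + y*x^2) is attained at the stationary point x = -1/(2y).
def psi(y):
    x = -1.0 / (2.0 * y)              # solves 1 + 2*y*x = 0
    return x + y * x**2               # equals -1/(4y)

for y in [1.0, 10.0, 1e3, 1e6]:
    print(y, psi(y))                  # psi(y) = -1/(4y) -> 0 as y -> infinity
```

The supremum 0 is not attained by any finite y: no Slater point, yet no duality gap.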
Example with positive duality gap

Consider

(CO)  min e^{−x_2}  s.t.  √(x_1² + x_2²) − x_1 ≤ 0,  x ∈ R².

The feasible region is F = {x ∈ R² : x_1 ≥ 0, x_2 = 0}. This makes clear that (CO) is not Slater regular. The optimal value of the objective function is 1. The Lagrange function is given by

L(x, y) = e^{−x_2} + y(√(x_1² + x_2²) − x_1).

The Lagrange-dual is

(LD)  sup ψ(y)  s.t.  y ≥ 0.
Example (cont.)

L(x, y) = e^{−x_2} + y(√(x_1² + x_2²) − x_1)

Note that L(x, y) > 0. Hence ψ(y) ≥ 0. Now let ε > 0. Taking

x_2 = −ln ε  and  x_1 = (x_2² − ε²)/(2ε),

one has

√(x_1² + x_2²) − x_1 = ε.

Hence, for these values of x_1 and x_2 we have L(x, y) = ε + yε = (1 + y)ε. Therefore,

ψ(y) = inf_{x∈R²} L(x, y) ≤ inf_{ε>0} (1 + y)ε = 0.

Since we also have ψ(y) ≥ 0, we conclude that the optimal value of the Lagrange-dual (LD) sup {ψ(y) : y ≥ 0} is 0, and hence the duality gap equals 1! (No strong duality here.)
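The construction above can be traced numerically: evaluating L along the curve x_2 = −ln ε, x_1 = (x_2² − ε²)/(2ε) shows L tending to 0 for any fixed y, while the primal optimal value stays 1.

```python
import numpy as np

# Lagrange function of the example: L(x, y) = exp(-x2) + y*(||x|| - x1)
def L(x1, x2, y):
    return np.exp(-x2) + y * (np.hypot(x1, x2) - x1)

y = 5.0                                # any fixed y >= 0
for eps in [1.0, 1e-2, 1e-4, 1e-6]:
    x2 = -np.log(eps)                  # makes exp(-x2) = eps
    x1 = (x2**2 - eps**2) / (2 * eps)  # makes sqrt(x1^2 + x2^2) - x1 = eps
    print(eps, L(x1, x2, y))           # -> (1 + y)*eps -> 0
```

So ψ(y) = 0 for every y ≥ 0 while the primal optimum is 1: the duality gap equals 1.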
The Wolfe-dual

Recall that the Lagrange-dual of (CO) is:

sup_{y≥0} inf_{x∈C} {f(x) + ∑_{j=1}^m y_j g_j(x)}.

Assume that C = R^n and the functions f, g_1, ..., g_m are continuously differentiable and convex. For a given y ≥ 0 the inner minimization problem is convex, and we can use the fact that the infimum is attained if and only if the gradient with respect to x is zero. The problem

(WD)  sup_{x,y} {f(x) + ∑_{j=1}^m y_j g_j(x)}
      s.t.  ∇f(x) + ∑_{j=1}^m y_j ∇g_j(x) = 0,  y ≥ 0

is called the Wolfe-dual of (CO).

N.B. The equation ∇f(x) + ∑_{j=1}^m y_j ∇g_j(x) = 0 is usually nonlinear. In such cases the Wolfe-dual is not convex!
Example: Wolfe-dual

Consider the convex optimization problem

(CO)  min x_1 + e^{x_2}  s.t.  3x_1 − 2e^{x_2} ≥ 10,  x_2 ≥ 0,  x ∈ R².

Then the optimal value is 5, attained at x = (4, 0). The Wolfe-dual of (CO) is

(WD)  sup_{x,y}  x_1 + e^{x_2} + y_1(10 − 3x_1 + 2e^{x_2}) − y_2 x_2
      s.t.  1 − 3y_1 = 0
            e^{x_2} + 2e^{x_2} y_1 − y_2 = 0
            x ∈ R², y ≥ 0,

which is a non-convex problem. The first constraint gives y_1 = 1/3, and thus the second constraint becomes (5/3)e^{x_2} − y_2 = 0.
Example: Wolfe-dual (cont.)

Now we can eliminate y_1 and y_2 from the objective function. We get the function

f(x_2) = (5/3)e^{x_2} − (5/3)x_2 e^{x_2} + 10/3.

This function has a maximum when f′(x_2) = −(5/3)x_2 e^{x_2} = 0, which is only true when x_2 = 0, and f(0) = 5. Hence the optimal value of (WD) is 5, attained at (x, y) = (4, 0, 1/3, 5/3).

Remark: The substitution y = e^{x_2} ≥ 1 makes the problem linear.
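A quick numeric check of this computation: maximize the reduced Wolfe-dual objective f(x_2) on a grid and compare with the primal value at x = (4, 0).

```python
import numpy as np

# Reduced Wolfe-dual objective after eliminating y1 = 1/3, y2 = (5/3)e^{x2}:
def f(x2):
    return (5/3) * np.exp(x2) - (5/3) * x2 * np.exp(x2) + 10/3

xs = np.linspace(-5, 5, 100001)
print(f(xs).max(), f(0.0))            # maximum ~5, attained at x2 = 0

# Primal objective x1 + e^{x2} at the optimizer x = (4, 0):
print(4 + np.exp(0.0))                # 5.0 -> no duality gap
```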
The Wolfe-dual: weak duality property

Theorem 3.9  Assume that C = R^n and the functions f, g_1, ..., g_m are continuously differentiable and convex. If x̂ is a feasible solution of (CO) and (x̄, ȳ) is a feasible solution of (WD), then

L(x̄, ȳ) ≤ f(x̂).

In other words, weak duality holds for (CO) and (WD).
Duality for Linear Optimization (LO)

Let A be an m × n matrix, b ∈ R^m and c, x ∈ R^n. The primal Linear Optimization (LO) problem in standard form is given by

(LO)  min {c^T x : Ax = b, x ≥ 0}.

Writing the problem as

min_{x∈R^n} {c^T x : Ax − b ≤ 0, b − Ax ≤ 0, −x ≤ 0},

the Wolfe-dual becomes

max_{y_1≥0, y_2≥0, s≥0} {c^T x + y_1^T(Ax − b) + y_2^T(b − Ax) − s^T x : c + A^T y_1 − A^T y_2 − s = 0}.

Replacing y_2 − y_1 by y, we get

max_{y, s≥0} {(c − A^T y − s)^T x + b^T y : c − A^T y − s = 0},

which yields the standard dual form of the linear optimization problem:

max {b^T y : A^T y ≤ c}.
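This primal–dual pair can be illustrated on a small instance (the data below is my own illustrative choice, not from the slides): exhibiting a primal-feasible x and dual-feasible y with equal objective values certifies, by weak duality, that both are optimal.

```python
import numpy as np

# Small illustrative LO instance (assumed data, chosen for this sketch):
A = np.array([[1.0, 1.0, 1.0],
              [1.0, 2.0, 0.0]])
b = np.array([4.0, 3.0])
c = np.array([1.0, 2.0, 3.0])

x = np.array([3.0, 0.0, 1.0])         # primal feasible: Ax = b, x >= 0
y = np.array([3.0, -2.0])             # dual feasible:   A^T y <= c

assert np.allclose(A @ x, b) and (x >= 0).all()
assert (A.T @ y <= c + 1e-12).all()

# Equal objective values: weak duality then certifies optimality of both.
print(c @ x, b @ y)                   # 6.0 6.0
```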
Convex quadratic optimization duality

(P)  min {c^T x + ½ x^T Q x : Ax ≥ b, x ≥ 0},  Q positive semidefinite.

The constraints are linear and the objective function is convex. Writing the problem as

min_{x∈R^n} {c^T x + ½ x^T Q x : b − Ax ≤ 0, −x ≤ 0},

the Wolfe-dual becomes

max {c^T x + ½ x^T Q x + y^T(b − Ax) − s^T x : c + Qx − A^T y − s = 0, y ≥ 0, s ≥ 0}.

Substituting c = −Qx + A^T y + s in the objective, we get

c^T x + ½ x^T Q x + y^T(b − Ax) − s^T x
  = (−Qx + A^T y + s)^T x + ½ x^T Q x + y^T(b − Ax) − s^T x
  = −(Qx)^T x + ½ x^T Q x + y^T b
  = b^T y − ½ x^T Q x.

Hence the problem simplifies to

max {b^T y − ½ x^T Q x : c + Qx − A^T y − s = 0, y ≥ 0, s ≥ 0}.

By eliminating s this becomes

max {b^T y − ½ x^T Q x : A^T y − Qx ≤ c, y ≥ 0}.
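As a sanity check, the quadratic example from earlier in this lecture, min {x_1² + x_2² : x_1 + x_2 ≥ 1}, fits this form with Q = 2I, c = 0, A = [1 1], b = [1] (adding the redundant bound x ≥ 0, which the optimizer satisfies anyway); the sketch below verifies the zero duality gap.

```python
import numpy as np

# Instance: min {x1^2 + x2^2 : x1 + x2 >= 1, x >= 0}
# written as c^T x + (1/2) x^T Q x with Q = 2*I, c = 0, A = [1 1], b = [1].
Q = 2.0 * np.eye(2)
c = np.zeros(2)
A = np.array([[1.0, 1.0]])
b = np.array([1.0])

x = np.array([0.5, 0.5])              # primal optimizer
y = np.array([1.0])                   # dual multiplier

# Dual feasibility: A^T y - Q x <= c and y >= 0
assert (A.T @ y - Q @ x <= c + 1e-12).all() and (y >= 0).all()

primal_val = c @ x + 0.5 * x @ Q @ x
dual_val = b @ y - 0.5 * x @ Q @ x
print(primal_val, dual_val)           # 0.5 0.5 -> zero duality gap
```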