Convex Optimization, Boyd & Vandenberghe. Chapter 5: Duality. Exercises and solutions.

Exercises

Basic definitions

5.1 A simple example. Consider the optimization problem

    minimize    x^2 + 1
    subject to  (x − 2)(x − 4) ≤ 0,

with variable x ∈ R.

(a) Analysis of primal problem. Give the feasible set, the optimal value, and the optimal solution.

(b) Lagrangian and dual function. Plot the objective x^2 + 1 versus x. On the same plot, show the feasible set, optimal point and value, and plot the Lagrangian L(x, λ) versus x for a few positive values of λ. Verify the lower bound property (p* ≥ inf_x L(x, λ) for λ ≥ 0). Derive and sketch the Lagrange dual function g.

(c) Lagrange dual problem. State the dual problem, and verify that it is a concave maximization problem. Find the dual optimal value and dual optimal solution λ*. Does strong duality hold?

(d) Sensitivity analysis. Let p*(u) denote the optimal value of the problem

    minimize    x^2 + 1
    subject to  (x − 2)(x − 4) ≤ u,

as a function of the parameter u. Plot p*(u). Verify that dp*(0)/du = −λ*.

Solution.

(a) The feasible set is the interval [2, 4]. The (unique) optimal point is x* = 2, and the optimal value is p* = 5. [Figure: the objective f0 and constraint function f1 versus x.]

(b) The Lagrangian is

    L(x, λ) = (1 + λ)x^2 − 6λx + (1 + 8λ).

[Figure: the Lagrangian L(x, λ) = f0 + λf1 as a function of x for different values of λ ≥ 0.] Note that the minimum value of L(x, λ) over x (i.e., g(λ)) is always less than p*. It increases as λ varies from 0 toward 2, reaches its maximum at λ = 2, and then decreases again as λ increases above 2. We have equality p* = g(λ) for λ = 2.

5 Duality

[Figure: f0 and f0 + λf1 for λ = 1.0, 2.0, 3.0, versus x.]

For λ > −1, the Lagrangian reaches its minimum at x̃ = 3λ/(1 + λ). For λ ≤ −1 it is unbounded below. Thus

    g(λ) = −9λ^2/(1 + λ) + 1 + 8λ,   λ > −1
    g(λ) = −∞,                       λ ≤ −1,

which is plotted below. [Figure: g(λ) for −2 ≤ λ ≤ 4.]

We can verify that the dual function is concave, that its value is equal to p* = 5 for λ = 2, and less than p* for other values of λ.

(c) The Lagrange dual problem is

    maximize    −9λ^2/(1 + λ) + 1 + 8λ
    subject to  λ ≥ 0.

The dual optimum occurs at λ* = 2, with d* = 5. So for this example we can directly observe that strong duality holds (as it must, since Slater's constraint qualification is satisfied).

(d) The perturbed problem is infeasible for u < −1, since inf_x (x^2 − 6x + 8) = −1. For u ≥ −1, the feasible set is the interval [3 − √(1 + u), 3 + √(1 + u)], given by the two roots of x^2 − 6x + 8 = u. For −1 ≤ u ≤ 8 the optimum is x*(u) = 3 − √(1 + u). For u ≥ 8, the optimum is the unconstrained minimum of f0,

i.e., x*(u) = 0. In summary,

    p*(u) = ∞,                   u < −1
    p*(u) = 11 + u − 6√(1 + u),  −1 ≤ u ≤ 8
    p*(u) = 1,                   u ≥ 8.

[Figure: the optimal value function p*(u) and its epigraph epi p*, with the supporting line through (0, p*(0)) of slope −λ*.]

Finally, we note that p*(u) is a differentiable function of u, and that

    dp*(0)/du = −2 = −λ*.

5.2 Weak duality for unbounded and infeasible problems. The weak duality inequality, d* ≤ p*, clearly holds when d* = −∞ or p* = ∞. Show that it holds in the other two cases as well: if p* = −∞, then we must have d* = −∞, and also, if d* = ∞, then we must have p* = ∞.

Solution.

(a) p* = −∞. The primal problem is unbounded, i.e., there exist feasible x with arbitrarily small values of f0(x). This means that

    L(x, λ) = f0(x) + Σ_{i=1}^m λ_i f_i(x)

is unbounded below for all λ ⪰ 0, i.e., g(λ) = −∞ for all λ ⪰ 0. Therefore the dual problem is infeasible (d* = −∞).

(b) d* = ∞. The dual problem is unbounded above. This is only possible if the primal problem is infeasible. If it were feasible, with f_i(x̃) ≤ 0 for i = 1, ..., m, then for all λ ⪰ 0,

    g(λ) = inf_x (f0(x) + Σ_i λ_i f_i(x)) ≤ f0(x̃) + Σ_i λ_i f_i(x̃) ≤ f0(x̃),

so the dual problem is bounded above.

5.3 Problems with one inequality constraint. Express the dual problem of

    minimize    c^T x
    subject to  f(x) ≤ 0,
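The closed-form expressions derived in exercise 5.1 are easy to sanity-check numerically. The sketch below (plain Python, no solver; the function names are ours, not part of the original solution) evaluates the dual function g(λ) and the perturbed optimal value p*(u):

```python
import math

def g(lam):
    # Lagrange dual function of exercise 5.1 (finite for lam > -1)
    return -9 * lam**2 / (1 + lam) + 1 + 8 * lam

def p_star(u):
    # perturbed optimal value p*(u) on -1 <= u <= 8
    return 11 + u - 6 * math.sqrt(1 + u)

# strong duality: g(2) equals p* = 5
print(g(2))  # prints 5.0

# lower bound property: g(lam) <= p* for all dual-feasible lam >= 0
assert all(g(0.1 * k) <= 5.0 + 1e-9 for k in range(0, 100))

# sensitivity: dp*(0)/du = -lambda* = -2 (central difference)
h = 1e-6
deriv = (p_star(h) - p_star(-h)) / (2 * h)
assert abs(deriv + 2.0) < 1e-4
```

The assertions mirror the three facts proved above: g is maximized at λ = 2 with value 5, it never exceeds p*, and the slope of p*(u) at u = 0 is −λ*.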

(c) Defining z = λw, we obtain the equivalent problem

    maximize    −b^T z
    subject to  A^T z + c = 0
                z ⪰ 0.

This is the dual of the original LP.

5.5 Dual of general LP. Find the dual function of the LP

    minimize    c^T x
    subject to  Gx ⪯ h
                Ax = b.

Give the dual problem, and make the implicit equality constraints explicit.

Solution.

(a) The Lagrangian is

    L(x, λ, ν) = c^T x + λ^T (Gx − h) + ν^T (Ax − b)
               = (c^T + λ^T G + ν^T A)x − λ^T h − ν^T b,

which is an affine function of x. It follows that the dual function is given by

    g(λ, ν) = inf_x L(x, λ, ν) = −λ^T h − ν^T b   if c + G^T λ + A^T ν = 0,
    g(λ, ν) = −∞                                  otherwise.

(b) The dual problem is

    maximize    g(λ, ν)
    subject to  λ ⪰ 0.

After making the implicit constraints explicit, we obtain

    maximize    −λ^T h − ν^T b
    subject to  c + G^T λ + A^T ν = 0
                λ ⪰ 0.

5.6 Lower bounds in Chebyshev approximation from least-squares. Consider the Chebyshev or ℓ∞-norm approximation problem

    minimize  ‖Ax − b‖∞,                                    (5.103)

where A ∈ R^{m×n} and rank A = n. Let x_ch denote an optimal solution (there may be multiple optimal solutions; x_ch denotes one of them).

The Chebyshev problem has no closed-form solution, but the corresponding least-squares problem does. Define

    x_ls = argmin ‖Ax − b‖_2 = (A^T A)^{−1} A^T b.

We address the following question. Suppose that for a particular A and b we have computed the least-squares solution x_ls (but not x_ch). How suboptimal is x_ls for the Chebyshev problem? In other words, how much larger is ‖Ax_ls − b‖∞ than ‖Ax_ch − b‖∞?

(a) Prove the lower bound

    ‖Ax_ls − b‖∞ ≤ √m ‖Ax_ch − b‖∞,

using the fact that for all z ∈ R^m,

    (1/√m) ‖z‖_2 ≤ ‖z‖∞ ≤ ‖z‖_2.

(b) In example 5.6 (page 254) we derived a dual for the general norm approximation problem. Applying the results to the ℓ∞-norm (and its dual norm, the ℓ1-norm), we can state the following dual for the Chebyshev approximation problem:

    maximize    b^T ν
    subject to  ‖ν‖_1 ≤ 1                                   (5.104)
                A^T ν = 0.

Any feasible ν corresponds to a lower bound b^T ν on ‖Ax_ch − b‖∞.

Denote the least-squares residual as r_ls = b − Ax_ls. Assuming r_ls ≠ 0, show that

    ν̂ = −r_ls/‖r_ls‖_1,    ν̃ = r_ls/‖r_ls‖_1,

are both feasible in (5.104). By duality b^T ν̂ and b^T ν̃ are lower bounds on ‖Ax_ch − b‖∞. Which is the better bound? How do these bounds compare with the bound derived in part (a)?

Solution.

(a) Simple manipulation yields

    ‖Ax_ch − b‖∞ ≥ (1/√m) ‖Ax_ch − b‖_2 ≥ (1/√m) ‖Ax_ls − b‖_2 ≥ (1/√m) ‖Ax_ls − b‖∞.

(b) From the expression x_ls = (A^T A)^{−1} A^T b we note that

    A^T r_ls = A^T (b − A(A^T A)^{−1} A^T b) = A^T b − A^T b = 0.

Therefore A^T ν̂ = 0 and A^T ν̃ = 0. Obviously we also have ‖ν̂‖_1 = 1 and ‖ν̃‖_1 = 1, so ν̂ and ν̃ are dual feasible. We can write the dual objective value at ν̂ as

    b^T ν̂ = −b^T r_ls/‖r_ls‖_1 = (Ax_ls − b)^T r_ls/‖r_ls‖_1 = −‖r_ls‖_2^2/‖r_ls‖_1

and, similarly, b^T ν̃ = ‖r_ls‖_2^2/‖r_ls‖_1. Therefore ν̃ gives a better bound than ν̂. Finally, to show that the resulting lower bound is better than the bound in part (a), we have to verify that

    ‖r_ls‖_2^2/‖r_ls‖_1 ≥ (1/√m) ‖r_ls‖∞.

This follows from the inequalities

    ‖x‖_1 ≤ √m ‖x‖_2,    ‖x‖∞ ≤ ‖x‖_2,

which hold for general x ∈ R^m.

5.7 Piecewise-linear minimization. We consider the convex piecewise-linear minimization problem

    minimize  max_{i=1,...,m} (a_i^T x + b_i)               (5.105)

with variable x ∈ R^n.
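The bound comparison in exercise 5.6 can be illustrated numerically. The sketch below (pure Python; the matrix A and vector b are arbitrary sample data of our choosing) solves the 2-variable normal equations by hand, checks A^T r_ls = 0, and compares the duality-based bound ‖r_ls‖_2^2/‖r_ls‖_1 with the part (a) bound ‖r_ls‖∞/√m:

```python
import math

# sample data (arbitrary, full column rank)
A = [[1.0, 2.0], [3.0, -1.0], [0.5, 1.5]]
b = [1.0, -2.0, 0.5]
m, n = 3, 2

# normal equations A^T A x = A^T b, solved with an explicit 2x2 inverse
ata = [[sum(A[k][i] * A[k][j] for k in range(m)) for j in range(n)] for i in range(n)]
atb = [sum(A[k][i] * b[k] for k in range(m)) for i in range(n)]
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
x_ls = [(ata[1][1] * atb[0] - ata[0][1] * atb[1]) / det,
        (ata[0][0] * atb[1] - ata[1][0] * atb[0]) / det]

r = [b[k] - sum(A[k][i] * x_ls[i] for i in range(n)) for k in range(m)]  # residual r_ls

# A^T r = 0, so nu = r / ||r||_1 is feasible in (5.104)
assert all(abs(sum(A[k][i] * r[k] for k in range(m))) < 1e-9 for i in range(n))

bound_dual = sum(rk**2 for rk in r) / sum(abs(rk) for rk in r)  # ||r||_2^2 / ||r||_1
bound_a = max(abs(rk) for rk in r) / math.sqrt(m)               # ||r||_inf / sqrt(m)

print(bound_dual >= bound_a)  # True: the duality-based bound dominates part (a)
```

Both quantities are valid lower bounds on ‖Ax_ch − b‖∞, and the final comparison reproduces the inequality proved at the end of part (b).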

5.10 Optimal experiment design. The following problems arise in experiment design (see §7.5).

(a) D-optimal design.

    minimize    log det (Σ_{i=1}^p x_i v_i v_i^T)^{−1}
    subject to  x ⪰ 0, 1^T x = 1.

(b) A-optimal design.

    minimize    tr (Σ_{i=1}^p x_i v_i v_i^T)^{−1}
    subject to  x ⪰ 0, 1^T x = 1.

The domain of both problems is {x | Σ_{i=1}^p x_i v_i v_i^T ≻ 0}. The variable is x ∈ R^p; the vectors v_1, ..., v_p ∈ R^n are given.

Derive dual problems by first introducing a new variable X ∈ S^n and an equality constraint X = Σ_{i=1}^p x_i v_i v_i^T, and then applying Lagrange duality. Simplify the dual problems as much as you can.

Solution.

(a) D-optimal design. We reformulate the problem as

    minimize    log det(X^{−1})
    subject to  X = Σ_{i=1}^p x_i v_i v_i^T
                x ⪰ 0, 1^T x = 1.

The Lagrangian is

    L(X, x, Z, z, ν) = log det(X^{−1}) + tr(Z(X − Σ_{i=1}^p x_i v_i v_i^T)) − z^T x + ν(1^T x − 1)
                     = log det(X^{−1}) + tr(ZX) + Σ_{i=1}^p x_i(−v_i^T Z v_i − z_i + ν) − ν.

The minimum over x_i is bounded below only if ν − v_i^T Z v_i = z_i. Setting the gradient with respect to X equal to zero gives X^{−1} = Z. We obtain the dual function

    g(Z, z, ν) = log det Z + n − ν   if ν − v_i^T Z v_i = z_i, i = 1, ..., p,
    g(Z, z, ν) = −∞                  otherwise.

The dual problem is

    maximize    log det Z + n − ν
    subject to  v_i^T Z v_i ≤ ν, i = 1, ..., p,

with domain S^n_{++} × R. We can eliminate ν by first making a change of variables W = (1/ν)Z, which gives

    maximize    log det W + n log ν + n − ν
    subject to  v_i^T W v_i ≤ 1, i = 1, ..., p.

Finally, we note that we can easily optimize n log ν − ν over ν. The optimum is ν = n, and substituting gives

    maximize    log det W + n log n
    subject to  v_i^T W v_i ≤ 1, i = 1, ..., p.
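Weak duality for the simplified D-optimal dual can be checked on a tiny instance. In the sketch below (pure Python 2×2 determinants; the vectors v_i and both feasible points are arbitrary choices of ours, not the optimizers), any W with v_i^T W v_i ≤ 1 must give a lower bound on the primal objective:

```python
import math

# two-dimensional example with p = 3 experiment vectors
v = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
p, n = 3, 2

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

# primal objective log det(sum_i x_i v_i v_i^T)^{-1} at the uniform x
x = [1.0 / p] * p
V = [[sum(xi * vi[a] * vi[b] for xi, vi in zip(x, v)) for b in range(n)]
     for a in range(n)]
primal = -math.log(det2(V))

# dual objective log det W + n log n at a feasible W = alpha*I,
# with alpha = 1 / max_i ||v_i||^2 so that v_i^T W v_i <= 1 for all i
alpha = 1.0 / max(vi[0]**2 + vi[1]**2 for vi in v)
dual = math.log(det2([[alpha, 0.0], [0.0, alpha]])) + n * math.log(n)

print(dual <= primal)  # True: weak duality
```

Here primal = log 3 and dual = 0, so the gap is visible; maximizing over feasible W would close it.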

(b) A-optimal design. We reformulate the problem as

    minimize    tr(X^{−1})
    subject to  X = Σ_{i=1}^p x_i v_i v_i^T
                x ⪰ 0, 1^T x = 1.

The Lagrangian is

    L(X, x, Z, z, ν) = tr(X^{−1}) + tr(Z(X − Σ_{i=1}^p x_i v_i v_i^T)) − z^T x + ν(1^T x − 1)
                     = tr(X^{−1}) + tr(ZX) + Σ_{i=1}^p x_i(−v_i^T Z v_i − z_i + ν) − ν.

The minimum over x is unbounded below unless v_i^T Z v_i + z_i = ν. The minimum over X can be found by setting the gradient equal to zero: X^{−2} = Z, i.e., X = Z^{−1/2} if Z ≻ 0, which gives

    inf_{X ≻ 0} (tr(X^{−1}) + tr(ZX)) = 2 tr(Z^{1/2})   if Z ⪰ 0,
                                      = −∞              otherwise.

The dual function is

    g(Z, z, ν) = −ν + 2 tr(Z^{1/2})   if Z ⪰ 0 and v_i^T Z v_i + z_i = ν, i = 1, ..., p,
    g(Z, z, ν) = −∞                   otherwise.

The dual problem is

    maximize    −ν + 2 tr(Z^{1/2})
    subject to  v_i^T Z v_i ≤ ν, i = 1, ..., p
                Z ⪰ 0.

As a first simplification, we define W = (1/ν)Z, and write the problem as

    maximize    −ν + 2√ν tr(W^{1/2})
    subject to  v_i^T W v_i ≤ 1, i = 1, ..., p
                W ⪰ 0.

By optimizing over ν > 0, we obtain

    maximize    (tr(W^{1/2}))^2
    subject to  v_i^T W v_i ≤ 1, i = 1, ..., p
                W ⪰ 0.

5.11 Derive a dual problem for

    minimize  Σ_{i=1}^N ‖A_i x + b_i‖_2 + (1/2)‖x − x_0‖_2^2.

The problem data are A_i ∈ R^{m_i×n}, b_i ∈ R^{m_i}, and x_0 ∈ R^n. First introduce new variables y_i ∈ R^{m_i} and equality constraints y_i = A_i x + b_i.

Solution. The Lagrangian is

    L(x, y, z_1, ..., z_N) = Σ_{i=1}^N ‖y_i‖_2 + (1/2)‖x − x_0‖_2^2 − Σ_{i=1}^N z_i^T (y_i − A_i x − b_i).
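The key step in the A-optimal derivation, inf_{X≻0} (tr(X^{−1}) + tr(ZX)) = 2 tr(Z^{1/2}), reduces in the scalar case to inf_{x>0} (1/x + zx) = 2√z, which a quick grid search confirms (a sketch of ours, not part of the original solution):

```python
import math

z = 3.7  # arbitrary positive value standing in for a scalar Z > 0
# grid-minimize 1/x + z*x over x > 0 and compare with the closed form 2*sqrt(z)
grid_min = min(1.0 / x + z * x for x in [k / 1000.0 for k in range(1, 20000)])
closed_form = 2.0 * math.sqrt(z)
print(abs(grid_min - closed_form) < 1e-3)  # True
```

The grid minimizer sits near x = 1/√z, matching the stationarity condition x^{−2} = z used in the matrix case.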

5.13 Lagrangian relaxation of Boolean LP. A Boolean linear program is an optimization problem of the form

    minimize    c^T x
    subject to  Ax ⪯ b
                x_i ∈ {0, 1}, i = 1, ..., n,

and is, in general, very difficult to solve. In exercise 4.15 we studied the LP relaxation of this problem,

    minimize    c^T x
    subject to  Ax ⪯ b                                      (5.107)
                0 ≤ x_i ≤ 1, i = 1, ..., n,

which is far easier to solve, and gives a lower bound on the optimal value of the Boolean LP. In this problem we derive another lower bound for the Boolean LP, and work out the relation between the two lower bounds.

(a) Lagrangian relaxation. The Boolean LP can be reformulated as the problem

    minimize    c^T x
    subject to  Ax ⪯ b
                x_i(1 − x_i) = 0, i = 1, ..., n,

which has quadratic equality constraints. Find the Lagrange dual of this problem. The optimal value of the dual problem (which is convex) gives a lower bound on the optimal value of the Boolean LP. This method of finding a lower bound on the optimal value is called Lagrangian relaxation.

(b) Show that the lower bound obtained via Lagrangian relaxation, and via the LP relaxation (5.107), are the same. Hint. Derive the dual of the LP relaxation (5.107).

Solution.

(a) The Lagrangian is

    L(x, µ, ν) = c^T x + µ^T (Ax − b) − ν^T x + x^T diag(ν) x
               = x^T diag(ν) x + (c + A^T µ − ν)^T x − b^T µ.

Minimizing over x gives the dual function

    g(µ, ν) = −b^T µ − (1/4) Σ_{i=1}^n (c_i + a_i^T µ − ν_i)^2/ν_i   if ν ⪰ 0,
    g(µ, ν) = −∞                                                     otherwise,

where a_i is the ith column of A, and we adopt the convention that a^2/0 = ∞ if a ≠ 0, and a^2/0 = 0 if a = 0. The resulting dual problem is

    maximize    −b^T µ − (1/4) Σ_{i=1}^n (c_i + a_i^T µ − ν_i)^2/ν_i
    subject to  µ ⪰ 0, ν ⪰ 0.

In order to simplify this dual, we optimize analytically over ν, by noting that

    sup_{ν_i ≥ 0} (−(c_i + a_i^T µ − ν_i)^2/(4ν_i)) = 0               if c_i + a_i^T µ ≥ 0
                                                    = c_i + a_i^T µ   if c_i + a_i^T µ ≤ 0,

i.e., the supremum equals min{0, c_i + a_i^T µ}. This allows us to eliminate ν from the dual problem, and simplify it as

    maximize    −b^T µ + Σ_{i=1}^n min{0, c_i + a_i^T µ}
    subject to  µ ⪰ 0.
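The analytic optimization over ν used in exercise 5.13(a) can be verified numerically for a few values of d = c_i + a_i^T µ (a pure-Python grid search of ours, not part of the original solution):

```python
def sup_neg_quartic(d):
    # sup over nu > 0 of -(d - nu)**2 / (4*nu), approximated on a fine grid
    return max(-(d - nu) ** 2 / (4.0 * nu)
               for nu in [k / 1000.0 for k in range(1, 20001)])

# the supremum equals min{0, d}: attained at nu = d when d >= 0
# (value 0) and at nu = -d when d < 0 (value d)
for d in [-3.0, -0.5, 0.0, 0.7, 2.5]:
    assert abs(sup_neg_quartic(d) - min(0.0, d)) < 1e-2
print("ok")
```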

(b) We follow the hint. The Lagrangian and dual function of the LP relaxation are

    L(x, u, v, w) = c^T x + u^T (Ax − b) − v^T x + w^T (x − 1)
                  = (c + A^T u − v + w)^T x − b^T u − 1^T w,

    g(u, v, w) = −b^T u − 1^T w   if A^T u − v + w + c = 0,
    g(u, v, w) = −∞               otherwise.

The dual problem is

    maximize    −b^T u − 1^T w
    subject to  A^T u − v + w + c = 0
                u ⪰ 0, v ⪰ 0, w ⪰ 0,

which is equivalent to the Lagrangian relaxation problem derived above. We conclude that the two relaxations give the same value.

5.14 A penalty method for equality constraints. We consider the problem

    minimize    f0(x)
    subject to  Ax = b,                                     (5.108)

where f0 : R^n → R is convex and differentiable, and A ∈ R^{m×n} with rank A = m. In a quadratic penalty method, we form an auxiliary function

    φ(x) = f0(x) + α ‖Ax − b‖_2^2,

where α > 0 is a parameter. This auxiliary function consists of the objective plus the penalty term α‖Ax − b‖_2^2. The idea is that a minimizer of the auxiliary function, x̃, should be an approximate solution of the original problem. Intuition suggests that the larger the penalty weight α, the better the approximation x̃ to a solution of the original problem. Suppose x̃ is a minimizer of φ. Show how to find, from x̃, a dual feasible point for (5.108). Find the corresponding lower bound on the optimal value of (5.108).

Solution. If x̃ minimizes φ, then

    ∇f0(x̃) + 2α A^T (Ax̃ − b) = 0.

Therefore x̃ is also a minimizer of f0(x) + ν^T (Ax − b), where ν = 2α(Ax̃ − b). Therefore ν is dual feasible with

    g(ν) = inf_x (f0(x) + ν^T (Ax − b)) = f0(x̃) + 2α ‖Ax̃ − b‖_2^2.

Therefore

    f0(x) ≥ f0(x̃) + 2α ‖Ax̃ − b‖_2^2

for all x that satisfy Ax = b.

5.15 Consider the problem

    minimize    f0(x)
    subject to  f_i(x) ≤ 0, i = 1, ..., m,                  (5.109)
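A one-dimensional instance of the penalty method in exercise 5.14, with f0(x) = x^2 and a single constraint ax = b, makes the lower bound concrete (the numbers below are arbitrary illustrations of ours):

```python
a, b_val, alpha = 2.0, 3.0, 10.0  # sample data: constraint 2x = 3, penalty weight 10

# minimizer of phi(x) = x^2 + alpha*(a*x - b)^2 (set the derivative to zero)
x_tilde = alpha * a * b_val / (1.0 + alpha * a**2)

nu = 2.0 * alpha * (a * x_tilde - b_val)  # the dual feasible point from the text
lower_bound = x_tilde**2 + 2.0 * alpha * (a * x_tilde - b_val)**2  # g(nu)
p_star = (b_val / a) ** 2  # true optimal value, attained at x = b/a

print(lower_bound <= p_star)  # True: the penalty minimizer certifies a lower bound
```

Increasing alpha drives x̃ toward b/a and the certified lower bound toward p*.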

(c) With variables x, u, t, v:

    minimize    x^T Σx
    subject to  p^T x ≥ r_min, 1^T x = 1, x ⪰ 0
                (n/20) t + 1^T u ≤ 0.9
                λ1 + u ⪰ 0, u ⪰ 0,

5.20 Dual of channel capacity problem. Derive a dual for the problem

    minimize    c^T x + Σ_{i=1}^m y_i log y_i
    subject to  Px = y
                x ⪰ 0, 1^T x = 1,

where P ∈ R^{m×n} has nonnegative elements, and its columns add up to one (i.e., P^T 1 = 1). The variables are x ∈ R^n, y ∈ R^m. (For c_j = −Σ_{i=1}^m p_ij log p_ij, the optimal value is, up to a factor log 2, the negative of the capacity of a discrete memoryless channel with channel transition probability matrix P; see exercise 4.57.) Simplify the dual problem as much as possible.

Solution. The Lagrangian is

    L(x, y, λ, ν, z) = c^T x + Σ_{i=1}^m y_i log y_i − λ^T x + ν(1^T x − 1) + z^T (Px − y)
                     = (c − λ + ν1 + P^T z)^T x + Σ_{i=1}^m y_i log y_i − z^T y − ν.

The minimum over x is bounded below if and only if

    c − λ + ν1 + P^T z = 0.

To minimize over y, we set the derivative with respect to y_i equal to zero, which gives log y_i + 1 − z_i = 0, and conclude that

    inf_{y_i ≥ 0} (y_i log y_i − z_i y_i) = −e^{z_i − 1}.

The dual function is

    g(λ, ν, z) = −Σ_{i=1}^m e^{z_i − 1} − ν   if c − λ + ν1 + P^T z = 0,
    g(λ, ν, z) = −∞                           otherwise.

The dual problem is (after eliminating λ ⪰ 0)

    maximize    −Σ_{i=1}^m e^{z_i − 1} − ν
    subject to  P^T z + ν1 + c ⪰ 0.

This can be simplified by introducing a variable w = z + ν1 (and using the fact that P^T 1 = 1), which gives

    maximize    −Σ_{i=1}^m e^{w_i − ν − 1} − ν
    subject to  P^T w + c ⪰ 0.

Finally we can easily maximize the objective function over ν by setting the derivative equal to zero (the optimal value is ν = log(Σ_i e^{w_i − 1})), which leads to

    maximize    −log(Σ_{i=1}^m exp w_i)
    subject to  P^T w + c ⪰ 0.

This is a geometric program, in convex form, with linear inequality constraints (i.e., monomial inequality constraints in the associated geometric program).
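Weak duality for the simplified channel-capacity dual of exercise 5.20 can be illustrated on a 2×2 channel (the matrix P and the two feasible points below are arbitrary sample choices of ours, not the optimizers):

```python
import math

# column-stochastic transition matrix P (columns sum to one)
P = [[0.8, 0.3],
     [0.2, 0.7]]
m = n = 2
c = [-sum(P[i][j] * math.log(P[i][j]) for i in range(m)) for j in range(n)]

# primal objective c^T x + sum_i y_i log y_i at a feasible x (a probability vector)
x = [0.5, 0.5]
y = [sum(P[i][j] * x[j] for j in range(n)) for i in range(m)]
primal = sum(c[j] * x[j] for j in range(n)) + sum(yi * math.log(yi) for yi in y)

# dual objective -log(sum_i exp(w_i)) at a feasible w, i.e. P^T w + c >= 0;
# w = t*(1,1) is feasible for t >= -min_j c_j, since P^T 1 = 1
t = -min(c)
w = [t, t]
assert all(sum(P[i][j] * w[i] for i in range(m)) + c[j] >= -1e-12 for j in range(n))
dual = -math.log(sum(math.exp(wi) for wi in w))

print(dual <= primal)  # True: weak duality
```

Here the primal value is the negative mutual information of the input x, and the dual value lower-bounds the negative channel capacity (in nats).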

There are four solutions:

    ν = −3.15,  x = (0.16, 0.47, 0.87)
    ν = 0.22,   x = (0.36, 0.82, 0.45)
    ν = 1.89,   x = (0.90, 0.35, 0.26)
    ν = 4.04,   x = (−0.97, 0.20, 0.17).

(c) ν* is the largest of the four values: ν* = 4.0352. This can be seen several ways. The simplest way is to compare the objective values of the four solutions x, which are

    f0(x) = 1.17,  f0(x) = 0.67,  f0(x) = 0.56,  f0(x) = −4.70.

We can also evaluate the dual objective at the four candidate values for ν. Finally we can note that we must have ∇^2 f0(x*) + ν* ∇^2 f1(x*) ⪰ 0, because x* is a minimizer of L(x, ν*). In other words

    [−3 0 0]        [1 0 0]
    [ 0 1 0] + ν*   [0 1 0]  ⪰ 0,
    [ 0 0 2]        [0 0 1]

and therefore ν* ≥ 3.

5.30 Derive the KKT conditions for the problem

    minimize    tr X − log det X
    subject to  Xs = y,

with variable X ∈ S^n and domain S^n_{++}. y ∈ R^n and s ∈ R^n are given, with s^T y = 1. Verify that the optimal solution is given by

    X = I + yy^T − (1/(s^T s)) ss^T.

Solution. We introduce a Lagrange multiplier z ∈ R^n for the equality constraint. The KKT optimality conditions are:

    X ≻ 0,    Xs = y,    X^{−1} = I + (1/2)(zs^T + sz^T).           (5.30.A)

We first determine z from the condition Xs = y. Multiplying the gradient equation on the right with y gives

    s = X^{−1} y = y + (1/2)(z + (z^T y)s).                         (5.30.B)

By taking the inner product with y on both sides and simplifying, we get z^T y = 1 − y^T y. Substituting in (5.30.B) we get

    z = −2y + (1 + y^T y)s,

and substituting this expression for z in (5.30.A) gives

    X^{−1} = I + (1/2)(−2ys^T − 2sy^T + 2(1 + y^T y)ss^T) = I + (1 + y^T y)ss^T − ys^T − sy^T.

Finally we verify that this is the inverse of the matrix X given above:

    (I + (1 + y^T y)ss^T − ys^T − sy^T) X
        = (I + yy^T − (1/s^T s)ss^T) + (1 + y^T y)(ss^T + sy^T − ss^T)
          − (ys^T + yy^T − ys^T) − (sy^T + (y^T y)sy^T − (1/s^T s)ss^T)
        = I.

To complete the solution, we prove that X ≻ 0. An easy way to see this is to note that

    X = I + yy^T − (1/(s^T s))ss^T
      = (I + ys^T/‖s‖_2 − ss^T/(s^T s)) (I + ys^T/‖s‖_2 − ss^T/(s^T s))^T.

5.31 Supporting hyperplane interpretation of KKT conditions. Consider a convex problem with no equality constraints,

    minimize    f0(x)
    subject to  f_i(x) ≤ 0, i = 1, ..., m.

Assume that x* ∈ R^n and λ* ∈ R^m satisfy the KKT conditions

    f_i(x*) ≤ 0,       i = 1, ..., m
    λ_i* ≥ 0,          i = 1, ..., m
    λ_i* f_i(x*) = 0,  i = 1, ..., m
    ∇f0(x*) + Σ_{i=1}^m λ_i* ∇f_i(x*) = 0.

Show that

    ∇f0(x*)^T (x − x*) ≥ 0

for all feasible x. In other words the KKT conditions imply the simple optimality criterion of §4.2.3.

Solution. Suppose x is feasible. Since the functions f_i are convex and f_i(x) ≤ 0, we have

    0 ≥ f_i(x) ≥ f_i(x*) + ∇f_i(x*)^T (x − x*),   i = 1, ..., m.

Using λ_i* ≥ 0, we conclude that

    0 ≥ Σ_{i=1}^m λ_i* (f_i(x*) + ∇f_i(x*)^T (x − x*))
      = Σ_{i=1}^m λ_i* f_i(x*) + Σ_{i=1}^m λ_i* ∇f_i(x*)^T (x − x*)
      = −∇f0(x*)^T (x − x*).

In the last line, we use the complementary slackness condition λ_i* f_i(x*) = 0, and the last KKT condition. This shows that ∇f0(x*)^T (x − x*) ≥ 0, i.e., ∇f0(x*) defines a supporting hyperplane to the feasible set at x*.

Perturbation and sensitivity analysis

5.32 Optimal value of perturbed problem. Let f0, f1, ..., fm : R^n → R be convex. Show that the function

    p*(u, v) = inf{f0(x) | x ∈ D, f_i(x) ≤ u_i, i = 1, ..., m, Ax − b = v}
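The closed-form solution of exercise 5.30 can be sanity-checked numerically for any s, y with s^T y = 1. The sketch below (pure-Python 3×3 matrix arithmetic; the particular s and y are arbitrary choices of ours) checks both Xs = y and the inverse formula:

```python
# pick s and y with s^T y = 1
s = [1.0, 2.0, -1.0]
y = [0.5, 0.5, 0.5]
assert abs(sum(si * yi for si, yi in zip(s, y)) - 1.0) < 1e-12

n = 3
sts = sum(si * si for si in s)  # s^T s
yty = sum(yi * yi for yi in y)  # y^T y

def outer(u, v):
    return [[ui * vj for vj in v] for ui in u]

def add(*Ms):
    return [[sum(M[i][j] for M in Ms) for j in range(n)] for i in range(n)]

def scale(a, M):
    return [[a * M[i][j] for j in range(n)] for i in range(n)]

I = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

# X = I + y y^T - (1/s^T s) s s^T and its claimed inverse
X = add(I, outer(y, y), scale(-1.0 / sts, outer(s, s)))
Xinv = add(I, scale(1.0 + yty, outer(s, s)),
           scale(-1.0, outer(y, s)), scale(-1.0, outer(s, y)))

# check the equality constraint Xs = y
Xs = [sum(X[i][j] * s[j] for j in range(n)) for i in range(n)]
assert all(abs(Xs[i] - y[i]) < 1e-9 for i in range(n))

# check Xinv X = I
prod = [[sum(Xinv[i][k] * X[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)]
assert all(abs(prod[i][j] - I[i][j]) < 1e-9 for i in range(n) for j in range(n))
print("ok")
```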