UC Berkeley Department of Electrical Engineering and Computer Science. EECS 227A Nonlinear and Convex Optimization. Solutions 5 Fall 2009

Reading: Boyd and Vandenberghe, Chapter 5

Solution 5.1 Note that ∇f(x) = 1 (the all-ones vector) for every x ∈ R^n. Since ∇h(x) = 2x ≠ 0 at any feasible point, every local minimum is regular. From the Lagrange multiplier theorem, there exists λ* such that

∇f(x*) + λ* ∇h(x*) = 0, or equivalently 1 + 2λ* x* = 0.

Combined with the constraint h(x*) = ||x*||_2^2 - 1 = 0, this yields the two possibilities

(x*_i, λ*) = (-1/√n, √n/2) or (1/√n, -√n/2), for i = 1, ..., n.

By the second-order optimality conditions,

y^T (∇^2 f(x*) + λ* ∇^2 h(x*)) y ≥ 0 for all y ∈ V(x*) := {y : 2(x*)^T y = 0}.

Since ∇^2 f(x*) = 0 and ∇^2 h(x*) = 2I, this condition is equivalent to 2λ* y^T y ≥ 0 for all y ∈ V(x*), which can hold only for non-negative λ*. Hence the only point satisfying both the first- and second-order conditions for a local minimum is x* = -(1/√n) 1. Since the constraint set is compact and f is continuous, an optimal point x* exists, so this point must be the optimum. One geometric interpretation of the Lagrange multiplier condition is that the gradient ∇f(x*) = 1 of the objective is parallel to the gradient ∇h(x*) = 2x* of the constraint function h(x) = ||x||_2^2 - 1, whose zero set is the unit sphere.
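As a quick numerical sanity check of Solution 5.1 (an addition, not part of the original solution set), the Python sketch below samples random points on the unit sphere and confirms that none of them attains a smaller value of 1^T x than the candidate x* = -(1/√n) 1; the dimension, seed, and sample size are arbitrary choices.

import numpy as np

n = 5
rng = np.random.default_rng(0)
x_star = -np.ones(n) / np.sqrt(n)                  # candidate optimum from Solution 5.1
Z = rng.standard_normal((100000, n))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)      # random feasible points: ||x||_2 = 1
print(Z.sum(axis=1).min())                         # best sampled value of 1^T x
print(x_star.sum())                                # equals -sqrt(n)
print(Z.sum(axis=1).min() >= x_star.sum() - 1e-9)  # expect True: no sample beats x*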

Solution 5.2 (a) The constraint set of the k-th problem is

C_k = {x ∈ R^n : ||x||_2^2 = 1, e_i^T x = 0, i = 1, ..., k-1},

so C_1 ⊇ C_2 ⊇ ... ⊇ C_n. Since all of the problems share the same objective, the optimal values satisfy λ_1 ≤ λ_2 ≤ ... ≤ λ_n.

(b) Suppose α_1, ..., α_n satisfy Σ_{i=1}^n α_i e_i = 0. Multiplying both sides by e_k^T and noting that e_k^T e_i = 0 for i ≠ k and e_k^T e_k = 1, we obtain α_k = 0 for k = 1, ..., n. Therefore e_1, ..., e_n are linearly independent.

(c) First consider problem P_1. Since e_1 ≠ 0, the optimum is regular, so the Lagrange multiplier theorem guarantees the existence of some µ_{1,1} ∈ R such that

2Q e_1 + 2µ_{1,1} e_1 = 0.

Taking inner products of both sides of this equation with e_1 and using ||e_1||_2^2 = 1 shows that λ_1 = e_1^T Q e_1 = -µ_{1,1}, and that (λ_1, e_1) is an eigenvalue/eigenvector pair. Now, from the linear independence established in (b), the points e_2, ..., e_n are regular optima for problems P_2, ..., P_n respectively. By the Lagrange multiplier theorem, for each i = 2, ..., n there exists a unique vector of multipliers µ^i = (µ_{i,1}, ..., µ_{i,i}) ∈ R^i for problem P_i such that

2Q e_i + 2µ_{i,i} e_i + Σ_{j=1}^{i-1} µ_{i,j} e_j = 0.     (1)

Taking inner products of both sides with e_i and using the orthogonality constraints gives

2 e_i^T Q e_i + 2µ_{i,i} ||e_i||_2^2 = 2λ_i + 2µ_{i,i} = 0,

so we can identify λ_i = -µ_{i,i} with the Lagrange multiplier for P_i associated with the constraint ||x||_2^2 - 1 = 0.

We have already shown that (λ_1, e_1) is an eigenvalue/eigenvector pair for Q. We now proceed by induction on i: suppose that for j = 1, ..., i-1 the pairs (λ_j, e_j) are eigenvalue/eigenvector pairs for Q. Consider equation (1) for P_i. Taking inner products with e_j for j = 1, ..., i-1, using the orthogonality relations and the fact that (by the induction hypothesis) e_j is an eigenvector with eigenvalue λ_j, yields

2 e_j^T Q e_i + µ_{i,j} ||e_j||_2^2 = 2λ_j e_j^T e_i + µ_{i,j} = µ_{i,j} = 0.

Thus, substituting µ_{i,i} = -λ_i and µ_{i,j} = 0 for j = 1, ..., i-1 into equation (1) yields 2Q e_i = 2λ_i e_i, so (λ_i, e_i) is another eigenvalue/eigenvector pair.

Solution 5.3 (a) Translating the regularity condition to linear constraints, the row vectors a_1^T, ..., a_m^T forming A would have to be linearly independent. If rank(A) < m, this is not possible.

(b) If A is rank-deficient, say rank(A) = k < m, we can form a reduced matrix B whose rows are a_{i_1}^T, ..., a_{i_k}^T, where {i_1, ..., i_k} is a subset of k indices from {1, ..., m} chosen so that B has rank k and imposes the same set of constraints as A. Thus the original problem is equivalent to

minimize f(x) subject to Bx = 0.

Any optimum of the reduced problem is regular by construction, so the Lagrange multiplier theorem yields scalars λ_{i_j} ∈ R, j = 1, ..., k, such that

∇f(x*) + Σ_{j=1}^k λ_{i_j} a_{i_j} = 0.

We can now add Lagrange multipliers λ_l = 0 for every index l not included in {i_1, ..., i_k}, which gives the Lagrange multiplier condition for the original problem.

(c) We lose uniqueness of the Lagrange multipliers, because there is freedom in which subset of k linearly independent rows we choose to form the reduced problem.
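Solution 5.2 identifies the optimal values of the nested problems P_1, ..., P_n with the eigenvalues of Q in increasing order. The numpy sketch below (an added check, not part of the original solutions) tests this numerically: for a random symmetric Q, unit-norm points orthogonal to the first i eigenvectors never produce a value of x^T Q x below the (i+1)-th smallest eigenvalue. The matrix size and sample counts are arbitrary.

import numpy as np

rng = np.random.default_rng(1)
n = 6
M = rng.standard_normal((n, n))
Q = M @ M.T                                  # a random symmetric matrix
lam, E = np.linalg.eigh(Q)                   # eigenvalues ascending, orthonormal eigenvectors
ok = True
for i in range(n):
    # sample points feasible for P_{i+1}: unit norm, orthogonal to e_1, ..., e_i
    Z = rng.standard_normal((20000, n))
    Z -= Z @ E[:, :i] @ E[:, :i].T           # project onto the orthogonal complement
    Z /= np.linalg.norm(Z, axis=1, keepdims=True)
    vals = np.einsum('ij,jk,ik->i', Z, Q, Z)
    ok = ok and vals.min() >= lam[i] - 1e-6  # optimal value of P_{i+1} should be lam[i]
print(ok)                                    # expect True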

Solution 5.4 The given problem is equivalent to minimizing f(x) = -y^T x subject to g(x) = x^T Q x - 1 ≤ 0. The constraint set is closed and bounded and the cost function is continuous, so a global minimum exists. If y = 0, then f(x) = 0 for all x in the constraint set, and the result follows. Otherwise, for y ≠ 0, we have ∇f(x) = -y and ∇g(x) = 2Qx. Since Q ≻ 0, any feasible point is regular. Assume that x* is a local minimum. By the KKT conditions, there exists µ* ≥ 0 such that

-y + 2µ* Q x* = 0, with µ* = 0 if (x*)^T Q x* < 1.

Since y ≠ 0, we must have µ* > 0, and thus (x*)^T Q x* = 1 and y = 2µ* Q x*. Multiplying both sides of this latter equation by (x*)^T yields

(x*)^T y = 2µ* (x*)^T Q x* = 2µ*.

But we also have

x* = Q^{-1} y / (2µ*) = Q^{-1} y / (y^T x*).

Taking inner products of both sides with y gives

y^T x* = (y^T Q^{-1} y) / (y^T x*),

which is equivalent to y^T x* = ±√(y^T Q^{-1} y). Therefore the optimal value of the minimization is -√(y^T Q^{-1} y), and so the optimal value of the original maximization problem is √(y^T Q^{-1} y). Now, for any x ≠ 0, let x̃ = x / √(x^T Q x). We have x̃^T Q x̃ = 1, so x̃ is feasible for the maximization problem and

y^T x̃ = y^T x / √(x^T Q x) ≤ √(y^T Q^{-1} y),

as required.

Solution 5.5 (a) Simple manipulation yields

||A x_cheb - b||_∞ ≥ (1/√m) ||A x_cheb - b||_2 ≥ (1/√m) ||A x_ls - b||_2 ≥ (1/√m) ||A x_ls - b||_∞,

where the first inequality uses ||u||_2 ≤ √m ||u||_∞, the second uses the optimality of x_ls for the least-squares problem, and the third uses ||u||_∞ ≤ ||u||_2.

(b) From the expression x_ls = (A^T A)^{-1} A^T b we note that

A^T r_ls = A^T (b - A (A^T A)^{-1} A^T b) = A^T b - A^T b = 0,

and therefore A^T ν̂ = 0 and A^T ν̄ = 0. Obviously we also have ||ν̂||_1 = ||ν̄||_1 = 1, so ν̂ and ν̄ are dual feasible. The dual objective value at ν̂ is

b^T ν̂ = b^T r_ls / ||r_ls||_1 = -(A x_ls - b)^T r_ls / ||r_ls||_1 = -||r_ls||_2^2 / ||r_ls||_1,

and similarly

b^T ν̄ = -b^T ν̂ = ||r_ls||_2^2 / ||r_ls||_1.

Therefore ν̄ gives a better bound than ν̂. Finally, to show that the resulting lower bound is better than the bound in part (a), we have to verify that

||r_ls||_2^2 / ||r_ls||_1 ≥ ||r_ls||_∞ / √m.

This follows from the inequalities

||x||_1 ≤ √m ||x||_2 and ||x||_∞ ≤ ||x||_2,

which hold for general x ∈ R^m.
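The bounds in Solution 5.5 are easy to test numerically. The sketch below (an added check, not part of the original solutions) generates a random least-squares instance, computes the part (a) bound ||r_ls||_∞ / √m and the dual bound ||r_ls||_2^2 / ||r_ls||_1, checks that the latter is at least as large, and confirms by weak duality that no trial point x achieves an ℓ∞ residual below the dual bound. The problem dimensions and sample count are arbitrary.

import numpy as np

rng = np.random.default_rng(3)
m, n = 30, 5
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
r_ls = A @ x_ls - b
bound_a = np.max(np.abs(r_ls)) / np.sqrt(m)       # lower bound from part (a)
bound_b = (r_ls @ r_ls) / np.sum(np.abs(r_ls))    # dual lower bound ||r_ls||_2^2 / ||r_ls||_1
print(bound_b >= bound_a)                         # expect True: part (b) improves on part (a)
X = rng.standard_normal((10000, n))               # arbitrary trial points
vals = np.max(np.abs(X @ A.T - b), axis=1)        # their residuals ||Ax - b||_inf
print(vals.min() >= bound_b)                      # expect True by weak duality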

Solution 5.6 The Lagrangian is

L(x, y_1, ..., y_N, z_1, ..., z_N) = Σ_{i=1}^N ||y_i||_2 + (1/2) ||x - x_0||_2^2 + Σ_{i=1}^N z_i^T (y_i - A_i x - b_i).

We first minimize over the y_i. We have

inf_{y_i} ( ||y_i||_2 + z_i^T y_i ) = 0 if ||z_i||_2 ≤ 1, and -∞ otherwise.

(If ||z_i||_2 > 1, choose y_i = -t z_i and let t → ∞ to show that the function is unbounded below. If ||z_i||_2 ≤ 1, it follows from the Cauchy-Schwarz inequality that ||y_i||_2 + z_i^T y_i ≥ (1 - ||z_i||_2) ||y_i||_2 ≥ 0, so the minimum is attained at y_i = 0.)

We can minimize over x by setting the gradient with respect to x equal to zero. This yields

x = x_0 + Σ_{i=1}^N A_i^T z_i.

Substituting into the Lagrangian gives the dual function

g(z_1, ..., z_N) = -Σ_{i=1}^N (A_i x_0 + b_i)^T z_i - (1/2) ||Σ_{i=1}^N A_i^T z_i||_2^2 if ||z_i||_2 ≤ 1 for i = 1, ..., N, and -∞ otherwise.

Therefore the dual problem (maximizing g, or equivalently minimizing -g) is

minimize   Σ_{i=1}^N (A_i x_0 + b_i)^T z_i + (1/2) ||Σ_{i=1}^N A_i^T z_i||_2^2
subject to ||z_i||_2 ≤ 1, i = 1, ..., N.
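As a cross-check of the dual derived in Solution 5.6 (an addition, not part of the original solutions; it assumes the cvxpy package is installed), the sketch below solves a small random instance of the primal and of the dual and confirms that the two optimal values agree, as strong duality predicts. The dual is entered in the equivalent maximization form obtained by replacing each z_i with -z_i and negating the objective. Problem sizes and data are arbitrary.

import numpy as np
import cvxpy as cp

rng = np.random.default_rng(4)
N, m, n = 3, 4, 5
A = [rng.standard_normal((m, n)) for _ in range(N)]
b = [rng.standard_normal(m) for _ in range(N)]
x0 = rng.standard_normal(n)

# primal: minimize sum_i ||A_i x + b_i||_2 + (1/2) ||x - x0||_2^2
x = cp.Variable(n)
primal = cp.Problem(cp.Minimize(sum(cp.norm(A[i] @ x + b[i], 2) for i in range(N))
                                + 0.5 * cp.sum_squares(x - x0)))
primal.solve()

# dual of Solution 5.6, rewritten as a maximization via z_i -> -z_i
z = [cp.Variable(m) for _ in range(N)]
obj = (sum((A[i] @ x0 + b[i]) @ z[i] for i in range(N))
       - 0.5 * cp.sum_squares(sum(A[i].T @ z[i] for i in range(N))))
dual = cp.Problem(cp.Maximize(obj), [cp.norm(z[i], 2) <= 1 for i in range(N)])
dual.solve()

print(primal.value, dual.value)   # the two optimal values should agree (strong duality)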

Solution 5.7 If x̃ minimizes φ, then

∇f_0(x̃) + 2α A^T (A x̃ - b) = 0.

Therefore x̃ is also a minimizer of f_0(x) + ν^T (A x - b), where ν = 2α (A x̃ - b). Therefore ν is dual feasible, with

g(ν) = inf_x ( f_0(x) + ν^T (A x - b) ) = f_0(x̃) + 2α ||A x̃ - b||_2^2.

By weak duality, it follows that

f_0(x) ≥ f_0(x̃) + 2α ||A x̃ - b||_2^2

for all x such that Ax = b.
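The lower bound in Solution 5.7 can be checked numerically for a concrete choice of objective. The sketch below (an added check, not part of the original solutions) assumes the penalty function has the form φ(x) = f_0(x) + α ||Ax - b||_2^2, which is consistent with the gradient expression used in the solution, and takes f_0(x) = ||x - c||_2^2 so that both the penalty minimizer x̃ and the exact constrained optimum have closed forms. It then verifies that f_0(x̃) + 2α ||A x̃ - b||_2^2 is indeed a lower bound on the minimum of f_0 over {x : Ax = b}. All data and the value of α are arbitrary.

import numpy as np

rng = np.random.default_rng(5)
m, n, alpha = 3, 6, 10.0
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
c = rng.standard_normal(n)

# penalty minimizer of phi(x) = ||x - c||^2 + alpha * ||Ax - b||^2 (set its gradient to zero)
x_tilde = np.linalg.solve(np.eye(n) + alpha * A.T @ A, c + alpha * A.T @ b)
bound = (x_tilde - c) @ (x_tilde - c) + 2 * alpha * (A @ x_tilde - b) @ (A @ x_tilde - b)

# exact minimum of f0 over {x : Ax = b}: squared distance from c to that affine set
w = np.linalg.solve(A @ A.T, A @ c - b)
p_star = (A.T @ w) @ (A.T @ w)

print(bound, p_star)
print(bound <= p_star + 1e-9)   # expect True: the penalty minimizer certifies a lower bound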