# Extreme Abridgment of Boyd and Vandenberghe's Convex Optimization


## Transcription

Compiled by David Rosenberg

**Abstract.** Boyd and Vandenberghe's *Convex Optimization* book is very well-written and a pleasure to read. The only potential problem is that, if you read it sequentially, you have to go through almost 300 pages to get through duality theory. It turns out that a well-chosen 10 pages are enough for a self-contained introduction to the topic. Most of the text here is copied essentially verbatim from the original.

### 1 Notation

- Use notation $f : \mathbf{R}^p \to \mathbf{R}^q$ to mean that $f$ maps from some subset of $\mathbf{R}^p$, namely $\operatorname{dom} f \subseteq \mathbf{R}^p$, where $\operatorname{dom} f$ stands for the domain of the function $f$.
- $\mathbf{R}$ are the real numbers.
- $\mathbf{R}_{+}$ are the nonnegative reals.
- $\mathbf{R}_{++}$ are the positive reals.
- $a \preceq b$ for $a, b \in \mathbf{R}^d$ means component-wise inequality, i.e. $a_i \le b_i$ for $i \in \{1, \dots, d\}$.

### 2 Affine and Convex Sets (BV 2.1)

#### 2.1 Affine Sets

Intuitively, an affine set is any point, line, plane, or hyperplane. But let's make this more precise.

**Definition 1.** A set $C \subseteq \mathbf{R}^n$ is *affine* if the line through any two distinct points in $C$ lies in $C$. That is, if for any $x_1, x_2 \in C$ and $\theta \in \mathbf{R}$, we have $\theta x_1 + (1 - \theta) x_2 \in C$.

Recall that a subspace is a subset of a vector space that is closed under sums and scalar multiplication. If $C$ is an affine set and $x_0 \in C$, then the set

$$V = C - x_0 = \{x - x_0 \mid x \in C\}$$

is a subspace. Thus, we can also write an affine set as

$$C = V + x_0 = \{v + x_0 \mid v \in V\},$$

i.e. as a subspace plus an offset. The subspace $V$ associated with the affine set $C$ does not depend on the choice of $x_0 \in C$. Thus we can make the following definition:

**Definition 2.** The *dimension* of an affine set $C$ is the dimension of the subspace $V = C - x_0$, where $x_0$ is any element of $C$.

We note that the solution set of a system of linear equations is an affine set, and every affine set can be expressed as the solution set of a system of linear equations [BV Example 2.1, p. 22].

**Definition 3.** A *hyperplane* in $\mathbf{R}^n$ is a set of the form $\{x \mid a^T x = b\}$, for $a \in \mathbf{R}^n$, $a \ne 0$, $b \in \mathbf{R}$, where $a$ is the normal vector to the hyperplane.

Note that a hyperplane in $\mathbf{R}^n$ is an affine set of dimension $n - 1$.

#### 2.2 Convex Sets (BV 2.1.4)

**Definition 4.** A set $C$ is *convex* if the line segment between any two points in $C$ lies in $C$. That is, if for any $x_1, x_2 \in C$ and any $\theta$ with $0 \le \theta \le 1$ we have

$$\theta x_1 + (1 - \theta) x_2 \in C.$$

Every affine set is also convex.

#### 2.3 Spans and Hulls

Given a set of points $x_1, \dots, x_k \in \mathbf{R}^n$, there are various types of "linear combinations" that we can take:

- A *linear combination* is a point of the form $\theta_1 x_1 + \cdots + \theta_k x_k$, with no constraints on the $\theta_i$'s. The *span* of $x_1, \dots, x_k$ is the set of all linear combinations of $x_1, \dots, x_k$.
- An *affine combination* is a point of the form $\theta_1 x_1 + \cdots + \theta_k x_k$, where $\theta_1 + \cdots + \theta_k = 1$. The *affine hull* of $x_1, \dots, x_k$, denoted $\operatorname{aff}(x_1, \dots, x_k)$, is the set of all affine combinations of $x_1, \dots, x_k$.
- A *convex combination* is a point of the form $\theta_1 x_1 + \cdots + \theta_k x_k$, where $\theta_1 + \cdots + \theta_k = 1$ and $\theta_i \ge 0$ for all $i$. The *convex hull* of $x_1, \dots, x_k$ is the set of all convex combinations of $x_1, \dots, x_k$.
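As a small illustration of the three kinds of combinations above, the following Python sketch classifies a coefficient vector by the most restrictive type it satisfies and evaluates the resulting point. The helper names `combine` and `combination_type` and the tolerance `tol` are illustrative choices, not part of the original notes.

```python
def combine(points, thetas):
    """Return sum_i theta_i * x_i for points x_i in R^n (given as tuples)."""
    n = len(points[0])
    return [sum(t * p[j] for t, p in zip(thetas, points)) for j in range(n)]


def combination_type(thetas, tol=1e-12):
    """Name the most restrictive kind of combination the coefficients define."""
    sums_to_one = abs(sum(thetas) - 1.0) <= tol
    if sums_to_one and all(t >= 0 for t in thetas):
        return "convex"   # coefficients sum to 1 and are nonnegative
    if sums_to_one:
        return "affine"   # coefficients sum to 1, signs unrestricted
    return "linear"       # no constraint on the coefficients


# Three points in R^2 (vertices of a triangle).
points = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]
thetas = [0.5, 0.25, 0.25]
print(combination_type(thetas), combine(points, thetas))
```

With the coefficients shown, the point lies inside the triangle, i.e. in the convex hull of the three points; coefficients such as `[2.0, -1.0, 0.0]` would give a point of the affine hull that lies outside the triangle.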

### 3 Convex Functions

#### 3.1 Definitions (BV 3.1, p. 67)

**Definition 5.** A function $f : \mathbf{R}^n \to \mathbf{R}$ is *convex* if $\operatorname{dom} f$ is a convex set and if for all $x, y \in \operatorname{dom} f$ and $0 \le \theta \le 1$, we have

$$f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y).$$

A function $f$ is *concave* if $-f$ is convex.

Geometrically, a function is convex if the line segment connecting any two points on the graph of $f$ lies above the graph.

**Definition 6.** A function $f$ is *strictly convex* if, when we additionally restrict $x \ne y$ and $0 < \theta < 1$, we get strict inequality:

$$f(\theta x + (1 - \theta) y) < \theta f(x) + (1 - \theta) f(y).$$

**Definition 7.** A function $f$ is *strongly convex* if there exists $\mu > 0$ such that $x \mapsto f(x) - \mu \|x\|^2$ is convex. The largest possible $\mu$ is called the *strong convexity constant*.
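The defining inequality of Definition 5 can be probed numerically for scalar functions. The sketch below draws random pairs $x, y$ and weights $\theta$ and reports whether the inequality is ever violated. Random sampling can only refute convexity, never prove it; the function name `violates_convexity`, the sampling interval, and the tolerance are assumptions made for illustration.

```python
import random


def violates_convexity(f, lo, hi, trials=10_000, tol=1e-9, seed=0):
    """Return True if some sampled (x, y, theta) breaks
    f(theta*x + (1-theta)*y) <= theta*f(x) + (1-theta)*f(y)."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y = rng.uniform(lo, hi), rng.uniform(lo, hi)
        theta = rng.random()
        lhs = f(theta * x + (1 - theta) * y)
        rhs = theta * f(x) + (1 - theta) * f(y)
        if lhs > rhs + tol:
            return True
    return False


# x^2 passes the sampled test; x^3 fails it on an interval containing negatives.
print(violates_convexity(lambda x: x * x, -2.0, 2.0))
print(violates_convexity(lambda x: x ** 3, -2.0, 2.0))

# Definition 7 with f(x) = x^2: x^2 - mu*x^2 stays convex for mu = 1
# but becomes concave for mu = 2.
print(violates_convexity(lambda x: x * x - 1.0 * x * x, -2.0, 2.0))
print(violates_convexity(lambda x: x * x - 2.0 * x * x, -2.0, 2.0))
```

The last two calls suggest that, under the convention of Definition 7, the strong convexity constant of $x \mapsto x^2$ is $1$, which agrees with the fact that $(1 - \mu) x^2$ is convex exactly when $\mu \le 1$.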

##### Consequences for Optimization

- convex: if there is a local minimum, then it is a global minimum
- strictly convex: if there is a local minimum, then it is the unique global minimum
- strongly convex: there exists a unique global minimum

##### First-order conditions (BV 3.1.3)

The following characterization of convex functions is possibly obvious from the picture, but we highlight it here because later it forms the basis for the definition of the subgradient, which generalizes the gradient to nondifferentiable functions.

Suppose $f : \mathbf{R}^n \to \mathbf{R}$ is differentiable (i.e. $\operatorname{dom} f$ is open and $\nabla f$ exists at each point in $\operatorname{dom} f$). Then $f$ is convex if and only if $\operatorname{dom} f$ is convex and

$$f(y) \ge f(x) + \nabla f(x)^T (y - x)$$

holds for all $x, y \in \operatorname{dom} f$. In other words, for a convex differentiable function, the linear approximation to $f$ at $x$ is a global underestimator of $f$.

The inequality shows that from local information about a convex function (i.e. its value and derivative at a point) we can derive global information (i.e. a global underestimator of it). This is perhaps the most important property of convex functions. For example, the inequality shows that if $\nabla f(x) = 0$, then for all $y \in \operatorname{dom} f$, $f(y) \ge f(x)$, i.e. $x$ is a global minimizer of $f$.

##### Examples of Convex Functions (BV 3.1.5)

Functions mapping from $\mathbf{R}$:

- $x \mapsto e^{ax}$ is convex on $\mathbf{R}$ for all $a \in \mathbf{R}$
- $x \mapsto x^a$ is convex on $\mathbf{R}_{++}$ when $a \ge 1$ or $a \le 0$, and concave for $0 \le a \le 1$
- $|x|^p$ for $p \ge 1$ is convex on $\mathbf{R}$
- $\log x$ is concave on $\mathbf{R}_{++}$
- $x \log x$ (either on $\mathbf{R}_{++}$ or on $\mathbf{R}_{+}$ if we define $0 \log 0 = 0$) is convex

Functions mapping from $\mathbf{R}^n$:

- Every norm on $\mathbf{R}^n$ is convex
- Max: $(x_1, \dots, x_n) \mapsto \max\{x_1, \dots, x_n\}$ is convex on $\mathbf{R}^n$
- Log-Sum-Exp¹: $(x_1, \dots, x_n) \mapsto \log\left(e^{x_1} + \cdots + e^{x_n}\right)$ is convex on $\mathbf{R}^n$

#### 3.2 Operations that preserve convexity (Section 3.2, p. 79)

##### Nonnegative weighted sums

If $f_1, \dots, f_m$ are convex and $w_1, \dots, w_m \ge 0$, then $f = w_1 f_1 + \cdots + w_m f_m$ is convex. More generally, if $f(x, y)$ is convex in $x$ for each $y \in \mathcal{A}$, and if $w(y) \ge 0$ for each $y \in \mathcal{A}$, then the function

$$g(x) = \int_{\mathcal{A}} w(y) f(x, y) \, dy$$

is convex in $x$ (provided the integral exists).

##### Composition with an affine mapping

A function $f : \mathbf{R}^n \to \mathbf{R}^m$ is an *affine function* (or *affine mapping*) if it is a sum of a linear function and a constant. That is, if it has the form $f(x) = Ax + b$, where $A \in \mathbf{R}^{m \times n}$ and $b \in \mathbf{R}^m$.

Composition of a convex function with an affine function is convex. More precisely: suppose $f : \mathbf{R}^n \to \mathbf{R}$, $A \in \mathbf{R}^{n \times m}$ and $b \in \mathbf{R}^n$. Define $g : \mathbf{R}^m \to \mathbf{R}$ by

$$g(x) = f(Ax + b),$$

with $\operatorname{dom} g = \{x \mid Ax + b \in \operatorname{dom} f\}$. If $f$ is convex, then so is $g$; if $f$ is concave, so is $g$. If $f$ is strictly convex and $A$ has linearly independent columns, then $g$ is also strictly convex.

##### Simple Composition Rules

- If $g$ is convex then $\exp g(x)$ is convex.
- If $g$ is convex and nonnegative and $p \ge 1$ then $g(x)^p$ is convex.
- If $g$ is concave and positive then $\log g(x)$ is concave.
- If $g$ is concave and positive then $1/g(x)$ is convex.

¹ This function can be interpreted as a differentiable (in fact, analytic) approximation to the max function, since

$$\max\{x_1, \dots, x_n\} \le \log\left(e^{x_1} + \cdots + e^{x_n}\right) \le \max\{x_1, \dots, x_n\} + \log n.$$

Can you prove it? Hint: $\max(a, b) \le a + b \le 2 \max(a, b)$ for nonnegative $a, b$.
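The two-sided bound on log-sum-exp in footnote 1 is easy to check numerically. The sketch below evaluates the function in the usual numerically stable way (subtracting the maximum before exponentiating) and tests the bound; the names `log_sum_exp` and `within_max_bounds` and the tolerance are illustrative, not from the original notes.

```python
import math


def log_sum_exp(xs):
    """Compute log(sum(exp(x) for x in xs)), factoring out the max for stability."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def within_max_bounds(xs, tol=1e-12):
    """Check max(xs) <= log_sum_exp(xs) <= max(xs) + log(n)."""
    m, n = max(xs), len(xs)
    value = log_sum_exp(xs)
    return m - tol <= value <= m + math.log(n) + tol


print(within_max_bounds([1.0, 2.0, 3.0]))
print(within_max_bounds([1000.0, 1000.0, -5.0]))  # would overflow without the max shift
```

The upper bound is tight when all entries are equal (each of the $n$ terms contributes equally), and the lower bound is approached when one entry dominates the others.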

##### Maximum of convex functions is convex (BV Section 3.2.3, p. 80)

Note: Below we use this to prove that the Lagrangian dual function is concave.

If $f_1, \dots, f_m : \mathbf{R}^n \to \mathbf{R}$ are convex, then their pointwise maximum

$$f(x) = \max\{f_1(x), \dots, f_m(x)\}$$

is also convex, with domain $\operatorname{dom} f = \operatorname{dom} f_1 \cap \cdots \cap \operatorname{dom} f_m$. This result extends to the supremum over arbitrary sets of functions (including uncountably infinite sets).

### 4 Optimization Problems (BV Chapter 4)

#### 4.1 General Optimization Problems (BV Section 4.1.1)

The standard form for an optimization problem is the following:

$$\begin{array}{ll} \text{minimize} & f_0(x) \\ \text{subject to} & f_i(x) \le 0, \quad i = 1, \dots, m \\ & h_i(x) = 0, \quad i = 1, \dots, p, \end{array}$$

where $x \in \mathbf{R}^n$ are called the *optimization variables*. The function $f_0 : \mathbf{R}^n \to \mathbf{R}$ is the *objective function* (or *cost function*); the inequalities $f_i(x) \le 0$ are called *inequality constraints* and the corresponding functions $f_i : \mathbf{R}^n \to \mathbf{R}$ are called the *inequality constraint functions*. The equations $h_i(x) = 0$ are called the *equality constraints* and the functions $h_i : \mathbf{R}^n \to \mathbf{R}$ are the *equality constraint functions*. If there are no constraints (i.e. $m = p = 0$), we say the problem is *unconstrained*.

The set of points for which the objective and all constraint functions are defined,

$$\mathcal{D} = \bigcap_{i=0}^{m} \operatorname{dom} f_i \;\cap\; \bigcap_{i=1}^{p} \operatorname{dom} h_i,$$

is called the *domain* of the optimization problem. A point $x \in \mathcal{D}$ is *feasible* if it satisfies all the equality and inequality constraints. The set of all feasible points is called the *feasible set* or the *constraint set*. If $x$ is feasible and $f_i(x) = 0$, then we say the $i$th inequality constraint $f_i(x) \le 0$ is *active* at $x$.

The *optimal value* $p^*$ of the problem is defined as

$$p^* = \inf\{f_0(x) \mid f_i(x) \le 0,\ i = 1, \dots, m,\ h_i(x) = 0,\ i = 1, \dots, p\}.$$

Note that if the problem is infeasible, $p^* = \infty$, since it is the inf of an empty set. We say that $x^*$ is an *optimal point* (or is a solution to the problem) if $x^*$ is feasible and $f_0(x^*) = p^*$. The set of optimal points is the *optimal set*.

We say that a feasible point $x$ is *locally optimal* if there is an $R > 0$ such that $x$ solves the following optimization problem:

$$\begin{array}{ll} \text{minimize} & f_0(z) \\ \text{subject to} & f_i(z) \le 0, \quad i = 1, \dots, m \\ & h_i(z) = 0, \quad i = 1, \dots, p \\ & \|z - x\|_2 \le R \end{array}$$

with optimization variable $z$. Roughly speaking, this means $x$ minimizes $f_0$ over nearby points in the feasible set.

#### 4.2 Convex Optimization Problems (Section 4.2, p. 136)

##### Convex optimization problems in standard form (Section 4.2.1)

The standard form for a convex optimization problem is the following:

$$\begin{array}{ll} \text{minimize} & f_0(x) \\ \text{subject to} & f_i(x) \le 0, \quad i = 1, \dots, m \\ & a_i^T x = b_i, \quad i = 1, \dots, p, \end{array}$$

where $f_0, \dots, f_m$ are convex functions. Compared with the general standard form, the convex problem has three additional requirements:

- the objective function must be convex
- the inequality constraint functions must be convex
- the equality constraint functions must be affine

We immediately note an important property: the feasible set of a convex optimization problem is convex (see BV p. 137).

##### Local and global optima (4.2.2, p. 138)

**Fact 8.** A fundamental property of convex optimization problems is that any locally optimal point is also globally optimal.

### 5 Duality (BV Chapter 5)

#### 5.1 The Lagrangian (BV Section 5.1.1)

We again consider the general optimization problem in standard form:

$$\begin{array}{ll} \text{minimize} & f_0(x) \\ \text{subject to} & f_i(x) \le 0, \quad i = 1, \dots, m \\ & h_i(x) = 0, \quad i = 1, \dots, p, \end{array}$$

with variable $x \in \mathbf{R}^n$. We assume its domain $\mathcal{D} = \bigcap_{i=0}^{m} \operatorname{dom} f_i \cap \bigcap_{i=1}^{p} \operatorname{dom} h_i$ is nonempty and denote the optimal value by $p^*$. We do not assume the problem is convex.

**Definition 9.** The *Lagrangian* $L : \mathbf{R}^n \times \mathbf{R}^m \times \mathbf{R}^p \to \mathbf{R}$ for the general optimization problem defined above is

$$L(x, \lambda, \nu) = f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x),$$

with $\operatorname{dom} L = \mathcal{D} \times \mathbf{R}^m \times \mathbf{R}^p$. We refer to $\lambda_i$ as the *Lagrange multiplier* associated with the $i$th inequality constraint and to $\nu_i$ as the Lagrange multiplier associated with the $i$th equality constraint. The vectors $\lambda$ and $\nu$ are called the *dual variables* or *Lagrange multiplier vectors*.

##### Max-min characterization of weak and strong duality (BV Section 5.4.1)

Note that

$$\sup_{\lambda \succeq 0, \nu} L(x, \lambda, \nu) = \sup_{\lambda \succeq 0, \nu} \left( f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x) \right) = \begin{cases} f_0(x) & \text{if } f_i(x) \le 0,\ i = 1, \dots, m, \text{ and } h_i(x) = 0,\ i = 1, \dots, p \\ \infty & \text{otherwise.} \end{cases}$$

In words, when $x$ is in the feasible set, we get back the objective function: $\sup_{\lambda \succeq 0, \nu} L(x, \lambda, \nu) = f_0(x)$. Otherwise, we get $\infty$.

*Proof.* Suppose $x$ violates an inequality constraint, say $f_i(x) > 0$. Then $\sup_{\lambda \succeq 0, \nu} L(x, \lambda, \nu) = \infty$, which we can see by taking $\lambda_j = 0$ for $j \ne i$, taking all $\nu_j = 0$, and sending $\lambda_i \to \infty$. We can make a similar argument for an equality constraint violation. If $x$ is feasible, then $f_i(x) \le 0$ and $h_i(x) = 0$ for all $i$, and thus the supremum is achieved by taking $\lambda = 0$, which yields $\sup_{\lambda \succeq 0, \nu} L(x, \lambda, \nu) = f_0(x)$.

It should now be clear that we can write the original optimization problem as

$$p^* = \inf_x \sup_{\lambda \succeq 0, \nu} L(x, \lambda, \nu).$$

In this context, this optimization problem is called the *primal problem*. We get the *Lagrange dual problem* by swapping the inf and the sup:

$$d^* = \sup_{\lambda \succeq 0, \nu} \inf_x L(x, \lambda, \nu),$$

where $d^*$ is the optimal value of the Lagrange dual problem.

**Theorem 10** (Weak max-min inequality, BV Exercise 5.24, p. 281). For any $f : \mathbf{R}^n \times \mathbf{R}^m \to \mathbf{R}$, $W \subseteq \mathbf{R}^n$, and $Z \subseteq \mathbf{R}^m$, we have

$$\sup_{z \in Z} \inf_{w \in W} f(w, z) \le \inf_{w \in W} \sup_{z \in Z} f(w, z).$$

*Proof.* For any $w_0 \in W$ and $z_0 \in Z$, we clearly have

$$\inf_{w \in W} f(w, z_0) \le f(w_0, z_0) \le \sup_{z \in Z} f(w_0, z).$$

Since this is true for all $w_0$ and $z_0$, we must also have

$$\sup_{z_0 \in Z} \inf_{w \in W} f(w, z_0) \le \inf_{w_0 \in W} \sup_{z \in Z} f(w_0, z).$$

In the context of an optimization problem, the weak max-min inequality is called *weak duality*, which always holds for any optimization problem (not just convex):

$$p^* = \inf_x \sup_{\lambda \succeq 0, \nu} \left[ f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x) \right] \ge \sup_{\lambda \succeq 0, \nu} \inf_x \left[ f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x) \right] = d^*.$$

The gap $p^* - d^*$ is called the *duality gap*. For convex optimization problems, we very often have *strong duality*, which is when we have the equality $p^* = d^*$.
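The weak max-min inequality of Theorem 10 can be seen directly on finite sets $W$ and $Z$, where $f$ is just a table of numbers. In the sketch below, rows index $w$ and columns index $z$; this layout and the function names `max_min` and `min_max` are illustrative assumptions.

```python
def max_min(table):
    """sup over z (columns) of inf over w (rows) of table[w][z]."""
    n_cols = len(table[0])
    return max(min(row[z] for row in table) for z in range(n_cols))


def min_max(table):
    """inf over w (rows) of sup over z (columns) of table[w][z]."""
    return min(max(row) for row in table)


# A table with a strict gap: max-min is 2, min-max is 3.
table = [[1, 4],
         [3, 2]]
print(max_min(table), min_max(table))  # 2 3
```

The strict gap in this table is the finite analogue of a nonzero duality gap; a table with a saddle point, such as `[[1, 2], [0, 3]]`, gives equality, which corresponds to strong duality.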

##### The Lagrange Dual Function (BV p. 216)

We define the *Lagrange dual function* (or just *dual function*) $g : \mathbf{R}^m \times \mathbf{R}^p \to \mathbf{R}$ as the minimum value of the Lagrangian over $x$: for $\lambda \in \mathbf{R}^m$, $\nu \in \mathbf{R}^p$,

$$g(\lambda, \nu) = \inf_{x \in \mathcal{D}} L(x, \lambda, \nu) = \inf_{x \in \mathcal{D}} \left( f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{i=1}^{p} \nu_i h_i(x) \right).$$

This is the inner minimization problem of the Lagrange dual problem discussed above. When the Lagrangian is unbounded below in $x$, the dual function takes on the value $-\infty$. The dual function is concave even when the optimization problem is not convex, since the dual function is the pointwise infimum of a family of affine functions of $(\lambda, \nu)$ (a different affine function for each $x \in \mathcal{D}$).

##### The Lagrange Dual Problem

From weak duality, it is clear that for each pair $(\lambda, \nu)$ with $\lambda \succeq 0$, the Lagrange dual function $g(\lambda, \nu)$ gives us a lower bound on $p^*$. A search for the best possible lower bound is one motivation for the Lagrange dual problem, which now we can write as

$$\begin{array}{ll} \text{maximize} & g(\lambda, \nu) \\ \text{subject to} & \lambda \succeq 0. \end{array}$$

In this context, a pair $(\lambda, \nu)$ is called *dual feasible* if $\lambda \succeq 0$ and $g(\lambda, \nu) > -\infty$. We refer to $(\lambda^*, \nu^*)$ as *dual optimal* or *optimal Lagrange multipliers* if they are optimal for the Lagrange dual problem. The Lagrange dual problem is a convex optimization problem, since the objective (to be maximized) is concave and the constraint is convex. This is the case whether or not the primal problem is convex.

#### 5.2 Strong duality and Slater's constraint qualification (5.2.3, p. 226)

For a convex optimization problem in standard form, we usually have strong duality, but not always. The additional conditions needed are called *constraint qualifications*. To state these conditions in their full generality, we need some new definitions. So we'll start with some consequences that are easier to state and use:

**Corollary 11.** For a convex optimization problem in standard form, if the domain² $\mathcal{D} = \bigcap_{i=0}^{m} \operatorname{dom} f_i$ is open³, and there exists an $x \in \mathcal{D}$ such that $Ax = b$ (the equality constraints $a_i^T x = b_i$ written in matrix form) and $f_i(x) < 0$ for $i = 1, \dots, m$ (such a point is called *strictly feasible*), then strong duality holds. Moreover, if $d^* > -\infty$, then the dual optimum is attained, that is, there exists a dual feasible $(\lambda^*, \nu^*)$ with $g(\lambda^*, \nu^*) = d^* = p^*$.

If $f_1, \dots, f_k$ are affine functions, it is sufficient to replace the strict inequality constraints for $f_1, \dots, f_k$ with inequality constraints $f_i(x) \le 0$ for $i = 1, \dots, k$, while the other conditions remain the same.

² Recall that the domain of a function, $\operatorname{dom} f_i$, is the set where the function is defined, and so $\mathcal{D}$ is the set where all the functions in the optimization problem are defined. In particular, $\mathcal{D}$ is NOT the feasible set (though it contains the feasible set).

³ For example, $\mathbf{R}^d$ is an open set, and is the domain we encounter most often.
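To make the lower-bound property and Corollary 11 concrete, here is a minimal sketch for a toy problem that is not taken from the original notes: minimize $x^2$ subject to $1 - x \le 0$. Its optimal value is $p^* = 1$ (attained at $x = 1$), and $x = 2$ is strictly feasible, so the corollary predicts $d^* = p^*$. The Lagrangian $L(x, \lambda) = x^2 + \lambda(1 - x)$ is minimized over $x$ at $x = \lambda/2$, which the code uses as a closed form; the grid over $\lambda$ and all names are illustrative assumptions.

```python
def lagrangian(x, lam):
    """L(x, lam) = x^2 + lam * (1 - x) for the toy problem."""
    return x * x + lam * (1.0 - x)


def dual_function(lam):
    """g(lam) = inf_x L(x, lam); the minimizer is x = lam / 2 (set dL/dx = 0)."""
    x_min = lam / 2.0
    return lagrangian(x_min, lam)


P_STAR = 1.0  # primal optimal value, attained at x = 1

# Evaluate g on a grid of dual-feasible multipliers lam >= 0.
lams = [k / 100.0 for k in range(0, 401)]
values = [dual_function(lam) for lam in lams]

# Weak duality: every g(lam) is a lower bound on p*.
print(all(v <= P_STAR + 1e-12 for v in values))

# The best lower bound on the grid.
best_value = max(values)
best_lam = lams[values.index(best_value)]
print(best_lam, best_value)
```

On this grid the best lower bound is found at $\lambda = 2$ with value $1$, matching $p^*$, so the duality gap is zero for this example.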

Simplified version appropriate for SVM:

**Corollary 12.** For a convex optimization problem in standard form, if the domain of $f_0$ is open, all equality and inequality constraints are linear, and the problem is feasible (i.e. there is some point in the domain that satisfies all the constraints), then we have strong duality and, if $d^* > -\infty$, then the dual optimum is attained.

For a more general statement, let's define the *affine dimension* of a set $C$ as the dimension of its affine hull. We define the *relative interior* of the set $C$, denoted $\operatorname{relint} C$, as the interior relative to $\operatorname{aff} C$:

$$\operatorname{relint} C = \{x \in C \mid B(x, r) \cap \operatorname{aff} C \subseteq C \text{ for some } r > 0\},$$

where $B(x, r) = \{y \mid \|y - x\| \le r\}$ is the ball of radius $r$ and center $x$ in the norm $\|\cdot\|$. Also, recall that for a convex optimization problem in standard form, the domain $\mathcal{D}$ is the intersection of the domains of the objective and the inequality constraint functions:

$$\mathcal{D} = \bigcap_{i=0}^{m} \operatorname{dom} f_i.$$

**Theorem 13.** For a convex optimization problem, if there exists an $x \in \operatorname{relint} \mathcal{D}$ such that $Ax = b$ and $f_i(x) < 0$ for $i = 1, \dots, m$ (such a point is sometimes called *strictly feasible*), then strong duality holds and, if $d^* > -\infty$, then the dual optimum is attained.

If $f_1, \dots, f_k$ are affine functions, it is sufficient to replace the strict inequality constraints for $f_1, \dots, f_k$ with inequality constraints $f_i(x) \le 0$ for $i = 1, \dots, k$, while the other conditions remain the same.

#### 5.3 Optimality Conditions (BV 5.5, p. 241)

##### Complementary slackness (BV 5.5.2, p. 242)

Suppose that the primal and dual optimal values are attained and equal (so, in particular, strong duality holds, but we're not assuming convexity). Let $x^*$ be a primal optimal point and $(\lambda^*, \nu^*)$ be a dual optimal point. This means that

$$\begin{aligned} f_0(x^*) &= g(\lambda^*, \nu^*) \\ &= \inf_x \left( f_0(x) + \sum_{i=1}^{m} \lambda_i^* f_i(x) + \sum_{i=1}^{p} \nu_i^* h_i(x) \right) \\ &\le f_0(x^*) + \sum_{i=1}^{m} \lambda_i^* f_i(x^*) + \sum_{i=1}^{p} \nu_i^* h_i(x^*) \\ &\le f_0(x^*). \end{aligned}$$

The first line states that the duality gap is zero, and the second line is the definition of the dual function. The third line follows since the infimum of the Lagrangian over $x$ is less than or equal to its value at $x = x^*$. The last inequality follows from feasibility of $\lambda^*$ and $x^*$ (meaning $\lambda^* \succeq 0$, $f_i(x^*) \le 0$ and $h_i(x^*) = 0$ for all $i$). Thus the inequalities are actually equalities. We can draw two interesting conclusions:

1. Since the third line is an equality, $x^*$ minimizes $L(x, \lambda^*, \nu^*)$ over $x$. (Note: $x^*$ may not be the unique minimizer of $L(x, \lambda^*, \nu^*)$.)

2. Since $\sum_{i=1}^{p} \nu_i^* h_i(x^*) = 0$ and each term in the sum $\sum_{i=1}^{m} \lambda_i^* f_i(x^*)$ is $\le 0$, each must actually be $0$. That is,

$$\lambda_i^* f_i(x^*) = 0, \quad i = 1, \dots, m.$$

This condition is known as *complementary slackness*, and it holds for any primal optimal $x^*$ and any dual optimal $(\lambda^*, \nu^*)$ when strong duality holds. Roughly speaking, it means the $i$th optimal Lagrange multiplier is zero unless the $i$th constraint is active at the optimum.

##### KKT optimality conditions for convex problems (BV 5.5.3, p. 243)

Consider a standard form convex optimization problem for which $f_0, \dots, f_m, h_1, \dots, h_p$ are differentiable (and therefore have open domains). Let $\tilde{x}, \tilde{\lambda}, \tilde{\nu}$ be any points that satisfy the following Karush-Kuhn-Tucker (KKT) conditions:

1. Primal and dual feasibility: $f_i(\tilde{x}) \le 0$, $h_i(\tilde{x}) = 0$, $\tilde{\lambda}_i \ge 0$ for all $i$.
2. Complementary slackness: $\tilde{\lambda}_i f_i(\tilde{x}) = 0$ for all $i$.
3. First order condition: $\nabla f_0(\tilde{x}) + \sum_{i=1}^{m} \tilde{\lambda}_i \nabla f_i(\tilde{x}) + \sum_{i=1}^{p} \tilde{\nu}_i \nabla h_i(\tilde{x}) = 0$.

Then $\tilde{x}$ and $(\tilde{\lambda}, \tilde{\nu})$ are primal and dual optimal, respectively, with zero duality gap.

To see this, note that $\tilde{x}$ is primal feasible, and $L(x, \tilde{\lambda}, \tilde{\nu})$ is convex in $x$, since $\tilde{\lambda}_i \ge 0$ and the $h_i$ are affine. Thus the first order condition implies that $\tilde{x}$ minimizes $L(x, \tilde{\lambda}, \tilde{\nu})$ over $x$. So

$$\begin{aligned} g(\tilde{\lambda}, \tilde{\nu}) &= L(\tilde{x}, \tilde{\lambda}, \tilde{\nu}) \\ &= f_0(\tilde{x}) + \sum_{i=1}^{m} \tilde{\lambda}_i f_i(\tilde{x}) + \sum_{i=1}^{p} \tilde{\nu}_i h_i(\tilde{x}) \\ &= f_0(\tilde{x}), \end{aligned}$$

where in the last line we use complementary slackness and $h_i(\tilde{x}) = 0$. Thus $\tilde{x}$ and $(\tilde{\lambda}, \tilde{\nu})$ have zero duality gap, and therefore are primal and dual optimal.

In summary, for any convex optimization problem with differentiable objective and constraint functions, any points that satisfy the KKT conditions are primal and dual optimal and have zero duality gap.
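The KKT conditions can be checked mechanically on a small example. The sketch below uses a toy problem chosen for illustration (it does not appear in the original notes): minimize $x_1^2 + x_2^2$ subject to $1 - x_1 - x_2 \le 0$, with no equality constraints. Solving the first order condition $2x - \lambda (1, 1) = 0$ together with the active constraint $x_1 + x_2 = 1$ by hand gives the candidate $\tilde{x} = (1/2, 1/2)$, $\tilde{\lambda} = 1$. All function names and the tolerance are assumptions.

```python
def f0(x):
    """Objective: squared Euclidean norm."""
    return x[0] ** 2 + x[1] ** 2


def f1(x):
    """Inequality constraint function; feasible means f1(x) <= 0."""
    return 1.0 - x[0] - x[1]


def grad_f0(x):
    return (2.0 * x[0], 2.0 * x[1])


def grad_f1(x):
    return (-1.0, -1.0)


def kkt_satisfied(x, lam, tol=1e-9):
    """Check feasibility, complementary slackness and the first order condition."""
    primal_feasible = f1(x) <= tol
    dual_feasible = lam >= -tol
    complementary = abs(lam * f1(x)) <= tol
    stationary = all(abs(a + lam * b) <= tol
                     for a, b in zip(grad_f0(x), grad_f1(x)))
    return primal_feasible and dual_feasible and complementary and stationary


def dual_function(lam):
    """g(lam) = inf_x L(x, lam); the Lagrangian is minimized at x = (lam/2, lam/2)."""
    x = (lam / 2.0, lam / 2.0)
    return f0(x) + lam * f1(x)


x_tilde, lam_tilde = (0.5, 0.5), 1.0
print(kkt_satisfied(x_tilde, lam_tilde))
print(f0(x_tilde), dual_function(lam_tilde))  # equal values mean zero duality gap
```

A feasible but non-optimal pair such as $x = (1, 1)$, $\lambda = 0$ satisfies feasibility and complementary slackness but fails the first order condition, so `kkt_satisfied` rejects it.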

### Lagrangian Duality and Convex Optimization

Lagrangian Duality and Convex Optimization David Rosenberg New York University February 11, 2015 David Rosenberg (New York University) DS-GA 1003 February 11, 2015 1 / 24 Introduction Why Convex Optimization?

### Convex Optimization & Lagrange Duality

Convex Optimization & Lagrange Duality Chee Wei Tan CS 8292 : Advanced Topics in Convex Optimization and its Applications Fall 2010 Outline Convex optimization Optimality condition Lagrange duality KKT

### CS-E4830 Kernel Methods in Machine Learning

CS-E4830 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 27. September, 2017 Juho Rousu 27. September, 2017 1 / 45 Convex optimization Convex optimisation This

### On the Method of Lagrange Multipliers

On the Method of Lagrange Multipliers Reza Nasiri Mahalati November 6, 2016 Most of what is in this note is taken from the Convex Optimization book by Stephen Boyd and Lieven Vandenberghe. This should

### Constrained Optimization and Lagrangian Duality

CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

### 5. Duality. Lagrangian

5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

### Convex Optimization Boyd & Vandenberghe. 5. Duality

5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

### Convex Optimization M2

Convex Optimization M2 Lecture 3 A. d Aspremont. Convex Optimization M2. 1/49 Duality A. d Aspremont. Convex Optimization M2. 2/49 DMs DM par email: dm.daspremont@gmail.com A. d Aspremont. Convex Optimization

### ICS-E4030 Kernel Methods in Machine Learning

ICS-E4030 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 28. September, 2016 Juho Rousu 28. September, 2016 1 / 38 Convex optimization Convex optimisation This

### Lecture: Duality of LP, SOCP and SDP

1/33 Lecture: Duality of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2017.html wenzw@pku.edu.cn Acknowledgement:

### Lecture: Duality.

Lecture: Duality http://bicmr.pku.edu.cn/~wenzw/opt-2016-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghe s lecture notes Introduction 2/35 Lagrange dual problem weak and strong

### A Brief Review on Convex Optimization

A Brief Review on Convex Optimization 1 Convex set S R n is convex if x,y S, λ,µ 0, λ+µ = 1 λx+µy S geometrically: x,y S line segment through x,y S examples (one convex, two nonconvex sets): A Brief Review

### Lagrange duality. The Lagrangian. We consider an optimization program of the form

Lagrange duality Another way to arrive at the KKT conditions, and one which gives us some insight on solving constrained optimization problems, is through the Lagrange dual. The dual is a maximization

### CSCI : Optimization and Control of Networks. Review on Convex Optimization

CSCI7000-016: Optimization and Control of Networks Review on Convex Optimization 1 Convex set S R n is convex if x,y S, λ,µ 0, λ+µ = 1 λx+µy S geometrically: x,y S line segment through x,y S examples (one

### Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 0-0 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R

### The Lagrangian L : R d R m R r R is an (easier to optimize) lower bound on the original problem:

HT05: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford Convex Optimization and slides based on Arthur Gretton s Advanced Topics in Machine Learning course

### I.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec - Spring 2010

I.3. LMI DUALITY Didier HENRION henrion@laas.fr EECI Graduate School on Control Supélec - Spring 2010 Primal and dual For primal problem p = inf x g 0 (x) s.t. g i (x) 0 define Lagrangian L(x, z) = g 0

### Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 4. Subgradient

Shiqian Ma, MAT-258A: Numerical Optimization 1 Chapter 4 Subgradient Shiqian Ma, MAT-258A: Numerical Optimization 2 4.1. Subgradients definition subgradient calculus duality and optimality conditions Shiqian

### 14. Duality. ˆ Upper and lower bounds. ˆ General duality. ˆ Constraint qualifications. ˆ Counterexample. ˆ Complementary slackness.

CS/ECE/ISyE 524 Introduction to Optimization Spring 2016 17 14. Duality ˆ Upper and lower bounds ˆ General duality ˆ Constraint qualifications ˆ Counterexample ˆ Complementary slackness ˆ Examples ˆ Sensitivity

### Motivation. Lecture 2 Topics from Optimization and Duality. network utility maximization (NUM) problem:

CDS270 Maryam Fazel Lecture 2 Topics from Optimization and Duality Motivation network utility maximization (NUM) problem: consider a network with S sources (users), each sending one flow at rate x s, through

### Convex Optimization and Modeling

Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.-Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual

### Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014

Convex Optimization Dani Yogatama School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA February 12, 2014 Dani Yogatama (Carnegie Mellon University) Convex Optimization February 12,

### subject to (x 2)(x 4) u,

Exercises Basic definitions 5.1 A simple example. Consider the optimization problem with variable x R. minimize x 2 + 1 subject to (x 2)(x 4) 0, (a) Analysis of primal problem. Give the feasible set, the

### Lagrange Duality. Daniel P. Palomar. Hong Kong University of Science and Technology (HKUST)

Lagrange Duality Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2017-18, HKUST, Hong Kong Outline of Lecture Lagrangian Dual function Dual

### EE/AA 578, Univ of Washington, Fall Duality

7. Duality EE/AA 578, Univ of Washington, Fall 2016 Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

Subgradients subgradients strong and weak subgradient calculus optimality conditions via subgradients directional derivatives Prof. S. Boyd, EE364b, Stanford University Basic inequality recall basic inequality

### Primal/Dual Decomposition Methods

Primal/Dual Decomposition Methods Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2018-19, HKUST, Hong Kong Outline of Lecture Subgradients

### Duality. Lagrange dual problem weak and strong duality optimality conditions perturbation and sensitivity analysis generalized inequalities

Duality Lagrange dual problem weak and strong duality optimality conditions perturbation and sensitivity analysis generalized inequalities Lagrangian Consider the optimization problem in standard form

### Karush-Kuhn-Tucker Conditions. Lecturer: Ryan Tibshirani Convex Optimization /36-725

Karush-Kuhn-Tucker Conditions Lecturer: Ryan Tibshirani Convex Optimization 10-725/36-725 1 Given a minimization problem Last time: duality min x subject to f(x) h i (x) 0, i = 1,... m l j (x) = 0, j =

### Introduction to Machine Learning Lecture 7. Mehryar Mohri Courant Institute and Google Research

Introduction to Machine Learning Lecture 7 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Convex Optimization Differentiation Definition: let f : X R N R be a differentiable function,

### Optimization for Communications and Networks. Poompat Saengudomlert. Session 4 Duality and Lagrange Multipliers

Optimization for Communications and Networks Poompat Saengudomlert Session 4 Duality and Lagrange Multipliers P Saengudomlert (2015) Optimization Session 4 1 / 14 24 Dual Problems Consider a primal convex

### Subgradient. Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes. definition. subgradient calculus

1/41 Subgradient Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes definition subgradient calculus duality and optimality conditions directional derivative Basic inequality

### Chap 2. Optimality conditions

Chap 2. Optimality conditions Version: 29-09-2012 2.1 Optimality conditions in unconstrained optimization Recall the definitions of global, local minimizer. Geometry of minimization Consider for f C 1

### Introduction to Mathematical Programming IE406. Lecture 10. Dr. Ted Ralphs

Introduction to Mathematical Programming IE406 Lecture 10 Dr. Ted Ralphs IE406 Lecture 10 1 Reading for This Lecture Bertsimas 4.1-4.3 IE406 Lecture 10 2 Duality Theory: Motivation Consider the following

### EE364a Review Session 5

EE364a Review Session 5 EE364a Review announcements: homeworks 1 and 2 graded homework 4 solutions (check solution to additional problem 1) scpd phone-in office hours: tuesdays 6-7pm (650-723-1156) 1 Complementary

### Convex Functions. Pontus Giselsson

Convex Functions Pontus Giselsson 1 Today s lecture lower semicontinuity, closure, convex hull convexity preserving operations precomposition with affine mapping infimal convolution image function supremum

### Lecture 18: Optimization Programming

Fall, 2016 Outline Unconstrained Optimization 1 Unconstrained Optimization 2 Equality-constrained Optimization Inequality-constrained Optimization Mixture-constrained Optimization 3 Quadratic Programming

### 4TE3/6TE3. Algorithms for. Continuous Optimization

4TE3/6TE3 Algorithms for Continuous Optimization (Duality in Nonlinear Optimization ) Tamás TERLAKY Computing and Software McMaster University Hamilton, January 2004 terlaky@mcmaster.ca Tel: 27780 Optimality

### 12. Interior-point methods

12. Interior-point methods Convex Optimization Boyd & Vandenberghe inequality constrained minimization logarithmic barrier function and central path barrier method feasibility and phase I methods complexity

### A function(al) f is convex if dom f is a convex set, and. f(θx + (1 θ)y) < θf(x) + (1 θ)f(y) f(x) = x 3

Convex functions The domain dom f of a functional f : R N R is the subset of R N where f is well-defined. A function(al) f is convex if dom f is a convex set, and f(θx + (1 θ)y) θf(x) + (1 θ)f(y) for all

### Optimality Conditions for Constrained Optimization

72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)

### Optimization for Machine Learning

Optimization for Machine Learning (Problems; Algorithms - A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html

### Lagrangian Duality Theory

Lagrangian Duality Theory Yinyu Ye Department of Management Science and Engineering Stanford University Stanford, CA 94305, U.S.A. http://www.stanford.edu/ yyye Chapter 14.1-4 1 Recall Primal and Dual

### Linear and non-linear programming

Linear and non-linear programming Benjamin Recht March 11, 2005 The Gameplan Constrained Optimization Convexity Duality Applications/Taxonomy 1 Constrained Optimization minimize f(x) subject to g j (x)

### Convex Optimization in Communications and Signal Processing

Convex Optimization in Communications and Signal Processing Prof. Dr.-Ing. Wolfgang Gerstacker 1 University of Erlangen-Nürnberg Institute for Digital Communications National Technical University of Ukraine,

### Lecture Notes on Support Vector Machine

Lecture Notes on Support Vector Machine Feng Li fli@sdu.edu.cn Shandong University, China 1 Hyperplane and Margin In a n-dimensional space, a hyper plane is defined by ω T x + b = 0 (1) where ω R n is

### Lecture 6: Conic Optimization September 8

IE 598: Big Data Optimization Fall 2016 Lecture 6: Conic Optimization September 8 Lecturer: Niao He Scriber: Juan Xu Overview In this lecture, we finish up our previous discussion on optimality conditions

### 3. Convex functions. basic properties and examples. operations that preserve convexity. the conjugate function. quasiconvex functions

3. Convex functions Convex Optimization Boyd & Vandenberghe basic properties and examples operations that preserve convexity the conjugate function quasiconvex functions log-concave and log-convex functions

### In English, this means that if we travel on a straight line between any two points in C, then we never leave C.

Convex sets In this section, we will be introduced to some of the mathematical fundamentals of convex sets. In order to motivate some of the definitions, we will look at the closest point problem from

### Finite Dimensional Optimization Part III: Convex Optimization 1

John Nachbar Washington University March 21, 2017 Finite Dimensional Optimization Part III: Convex Optimization 1 1 Saddle points and KKT. These notes cover another important approach to optimization,

### Structural and Multidisciplinary Optimization. P. Duysinx and P. Tossings

Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019 CONTACTS Pierre Duysinx Institut de Mécanique et du Génie Civil (B52/3) Phone number: 04/366.91.94 Email: P.Duysinx@uliege.be

### EC9A0: Pre-sessional Advanced Mathematics Course. Lecture Notes: Unconstrained Optimisation By Pablo F. Beker 1

EC9A0: Pre-sessional Advanced Mathematics Course Lecture Notes: Unconstrained Optimisation By Pablo F. Beker 1 Infimum and Supremum Definition 1. Fix a set $Y \subseteq \mathbb{R}$. A number $\alpha \in \mathbb{R}$ is an upper bound of Y if

### The Karush-Kuhn-Tucker (KKT) conditions

The Karush-Kuhn-Tucker (KKT) conditions In this section, we will give a set of sufficient (and, most of the time, necessary) conditions for a point x to be the solution of a given convex optimization problem. These

### CONVEX OPTIMIZATION, DUALITY, AND THEIR APPLICATION TO SUPPORT VECTOR MACHINES. Contents 1. Introduction 1 2. Convex Sets

CONVEX OPTIMIZATION, DUALITY, AND THEIR APPLICATION TO SUPPORT VECTOR MACHINES DANIEL HENDRYCKS Abstract. This paper develops the fundamentals of convex optimization and applies them to Support Vector

### Optimization and Optimal Control in Banach Spaces

Optimization and Optimal Control in Banach Spaces Bernhard Schmitzer October 19, 2017 1 Convex non-smooth optimization with proximal operators Remark 1.1 (Motivation). Convex optimization: easier to solve,

### Math 273a: Optimization Subgradients of convex functions

Math 273a: Optimization Subgradients of convex functions Made by: Damek Davis Edited by Wotao Yin Department of Mathematics, UCLA Fall 2015 online discussions on piazza.com 1 / 42 Subgradients Assumptions

### Applied Lagrange Duality for Constrained Optimization

Applied Lagrange Duality for Constrained Optimization February 12, 2002 Overview: The Practical Importance of Duality; Review of Convexity; A Separating Hyperplane Theorem; Definition of the Dual

### Lecture 7: Convex Optimizations

Lecture 7: Convex Optimizations Radu Balan, David Levermore March 29, 2018 Convex Sets. Convex Functions A set $S \subseteq \mathbb{R}^n$ is called a convex set if for any points $x, y \in S$ the line segment [x, y] := {tx + (1

### Optimisation convexe: performance, complexité et applications.

Optimisation convexe: performance, complexité et applications. Introduction, convexité, dualité. A. d'Aspremont. M2 OJME: Optimisation convexe. 1/128 Today Convex optimization: introduction Course organization

### HW1 solutions. 1. $\alpha \le \mathbf{E} f(x) \le \beta$, where $\mathbf{E} f(x)$ is the expected value of $f(x)$, i.e., $\mathbf{E} f(x) = \sum_{i=1}^n p_i f(a_i)$. (The function $f : \mathbb{R} \to \mathbb{R}$ is given.

HW1 solutions Exercise 1 (Some sets of probability distributions.) Let x be a real-valued random variable with $\mathbf{prob}(x = a_i) = p_i$, $i = 1, \ldots, n$, where $a_1 < a_2 < \cdots < a_n$. Of course $p \in \mathbb{R}^n$ lies in the standard

### UC Berkeley Department of Electrical Engineering and Computer Science. EECS 227A Nonlinear and Convex Optimization. Solutions 5 Fall 2009

UC Berkeley Department of Electrical Engineering and Computer Science EECS 227A Nonlinear and Convex Optimization Solutions 5 Fall 2009 Reading: Boyd and Vandenberghe, Chapter 5 Solution 5.1 Note that

### Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu

### 1. f(β) 0 (that is, β is a feasible point for the constraints)

xvi 2. The lasso for linear models 2.10 Bibliographic notes Appendix Convex optimization with constraints In this Appendix we present an overview of convex optimization concepts that are particularly useful

### Constrained optimization

Constrained optimization DS-GA 1013 / MATH-GA 2824 Optimization-based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_fall17/index.html Carlos Fernandez-Granda Compressed sensing Convex constrained

### CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS

CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS A Dissertation Submitted For The Award of the Degree of Master of Philosophy in Mathematics Neelam Patel School of Mathematics

L. Vandenberghe EE236C (Spring 2016) 1. Gradient method gradient method, first-order methods quadratic bounds on convex functions analysis of gradient method 1-1 Approximate course outline First-order

### Convex Functions. Daniel P. Palomar. Hong Kong University of Science and Technology (HKUST)

Convex Functions Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2017-18, HKUST, Hong Kong Outline of Lecture Definition convex function Examples

Additional Homework Problems Robert M. Freund April, 2004 © 2004 Massachusetts Institute of Technology. 1 Exercises 1. Let $\mathbb{R}^n_+$ denote the nonnegative orthant, namely $\mathbb{R}^n_+ = \{x \in \mathbb{R}^n \mid x_j \ge 0,\ j = 1, \ldots, n\}$.

### Support Vector Machines: Maximum Margin Classifiers

Support Vector Machines: Maximum Margin Classifiers Machine Learning and Pattern Recognition: September 16, 2008 Piotr Mirowski Based on slides by Sumit Chopra and Fu-Jie Huang 1 Outline What is behind

### Lecture 3 January 28

EECS 28B / STAT 24B: Advanced Topics in Statistical Learning Spring 2009 Lecture 3 January 28 Lecturer: Pradeep Ravikumar Scribe: Timothy J. Wheeler Note: These lecture notes are still rough, and have only

### Machine Learning. Support Vector Machines. Manfred Huber

Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data

### Generalization to inequality constrained problem. Maximize

Lecture 11. 26 September 2006 Review of Lecture #10: Second order optimality conditions necessary condition, sufficient condition. If the necessary condition is violated the point cannot be a local minimum

### Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016

Lecture 1: Entropy, convexity, and matrix scaling CSE 599S: Entropy optimality, Winter 2016 Instructor: James R. Lee Last updated: January 24, 2016 1 Entropy Since this course is about entropy maximization,

### UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems

UNDERGROUND LECTURE NOTES 1: Optimality Conditions for Constrained Optimization Problems Robert M. Freund February 2016 © 2016 Massachusetts Institute of Technology. All rights reserved. 1 Introduction

### Homework Set #6 - Solutions

EE 150 - Applications of Convex Optimization in Signal Processing and Communications Dr Andre Tkacenko JPL Third Term 11-1 Homework Set #6 - Solutions 1 a The feasible set is the interval [ 4] The unique

### Linear Programming. Larry Blume Cornell University, IHS Vienna and SFI. Summer 2016

Linear Programming Larry Blume Cornell University, IHS Vienna and SFI Summer 2016 These notes derive basic results in finite-dimensional linear programming using tools of convex analysis. Most sources

### Lecture Notes 9: Constrained Optimization

Optimization-based data analysis Fall 2017 Lecture Notes 9: Constrained Optimization 1 Compressed sensing 1.1 Underdetermined linear inverse problems Linear inverse problems model measurements of the form

### Lecture Note 5: Semidefinite Programming for Stability Analysis

ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State

### Subgradient Descent. David S. Rosenberg. New York University. February 7, 2018

Subgradient Descent David S. Rosenberg New York University February 7, 2018 David S. Rosenberg (New York University) DS-GA 1003 / CSCI-GA 2567 February 7, 2018 1 / 43 Contents 1 Motivation and Review:

### Existence of minimizers

Existence of minimizers We have just talked a lot about how to find the minimizer of an unconstrained convex optimization problem. We have not talked too much, at least not in concrete mathematical terms, about

### 2.098/6.255/15.093 Optimization Methods Practice True/False Questions

2.098/6.255/15.093 Optimization Methods Practice True/False Questions December 11, 2009 Part I For each one of the statements below, state whether it is true or false. Include a 1-3 line supporting sentence

### minimize x subject to $(x-2)(x-4) \le u$,

Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

### Numerical Optimization

Constrained Optimization Computer Science and Automation Indian Institute of Science Bangalore 560 012, India. NPTEL Course on Constrained Optimization Constrained Optimization Problem: min h j (x) 0,

### Nonlinear Programming and the Kuhn-Tucker Conditions

Nonlinear Programming and the Kuhn-Tucker Conditions The Kuhn-Tucker (KT) conditions are first-order conditions for constrained optimization problems, a generalization of the first-order conditions we

### Convex Optimization Theory. Chapter 5 Exercises and Solutions: Extended Version

Convex Optimization Theory Chapter 5 Exercises and Solutions: Extended Version Dimitri P. Bertsekas Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts http://www.athenasc.com

### Tutorial on Convex Optimization for Engineers Part II

Tutorial on Convex Optimization for Engineers Part II M.Sc. Jens Steinwandt Communications Research Laboratory Ilmenau University of Technology PO Box 100565 D-98684 Ilmenau, Germany jens.steinwandt@tu-ilmenau.de

### Summer School: Semidefinite Optimization

Summer School: Semidefinite Optimization Christine Bachoc Université Bordeaux I, IMB Research Training Group Experimental and Constructive Algebra Haus Karrenberg, Sept. 3 - Sept. 7, 2012 Duality Theory

### Inequality Constraints

Chapter 2 Inequality Constraints 2.1 Optimality Conditions Early in multivariate calculus we learn the significance of differentiability in finding minimizers. In this section we begin our study of the

### Machine Learning. Lecture 6: Support Vector Machine. Feng Li.

Machine Learning Lecture 6: Support Vector Machine Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 2018 Warm Up 2 / 80 Warm Up (Contd.)

### Lecture 5. Theorems of Alternatives and Self-Dual Embedding

IE 8534 Lecture 5. Theorems of Alternatives and Self-Dual Embedding A system of linear equations may not have a solution. It is well known that either $Ax = c$ has a solution, or $A^T y = 0$, c

### Duality Theory of Constrained Optimization

Duality Theory of Constrained Optimization Robert M. Freund April, 2014 © 2014 Massachusetts Institute of Technology. All rights reserved. 1 The Practical Importance of Duality Duality is pervasive

### EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17

EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 17 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory May 29, 2012 Andre Tkacenko

### CO 250 Final Exam Guide

Spring 2017 CO 250 Final Exam Guide TABLE OF CONTENTS richardwu.ca CO 250 Final Exam Guide Introduction to Optimization Kanstantsin Pashkovich Spring 2017 University of Waterloo Last Revision: March 4,

### Lecture 8 Plus properties, merit functions and gap functions. September 28, 2008

Lecture 8 Plus properties, merit functions and gap functions September 28, 2008 Outline Plus-properties and F-uniqueness Equation reformulations of VI/CPs Merit functions Gap merit functions FP-I book:

### A Tutorial on Convex Optimization II: Duality and Interior Point Methods

A Tutorial on Convex Optimization II: Duality and Interior Point Methods Haitham Hindi Palo Alto Research Center (PARC), Palo Alto, California 94304 email: hhindi@parc.com Abstract In recent years, convex

### Appendix PRELIMINARIES 1. THEOREMS OF ALTERNATIVES FOR SYSTEMS OF LINEAR CONSTRAINTS

Appendix PRELIMINARIES 1. THEOREMS OF ALTERNATIVES FOR SYSTEMS OF LINEAR CONSTRAINTS Here we consider systems of linear constraints, consisting of equations or inequalities or both. A feasible solution