# Karush-Kuhn-Tucker Conditions. Lecturer: Ryan Tibshirani Convex Optimization /36-725

Save this PDF as:

Size: px
Start display at page:

## Transcription

1 Karush-Kuhn-Tucker Conditions Lecturer: Ryan Tibshirani Convex Optimization /

2 Given a minimization problem Last time: duality min x subject to f(x) h i (x) 0, i = 1,... m l j (x) = 0, j = 1,... r we defined the Lagrangian: L(x, u, v) = f(x) + and Lagrange dual function: m u i h i (x) + g(u, v) = min x L(x, u, v) r v j l j (x) j=1 2

3 The subsequent dual problem is: Important properties: max u,v g(u, v) subject to u 0 Dual problem is always convex, i.e., g is always concave (even if primal problem is not convex) The primal and dual optimal values, f and g, always satisfy weak duality: f g Slater s condition: for convex primal, if there is an x such that h 1 (x) < 0,... h m (x) < 0 and l 1 (x) = 0,... l r (x) = 0 then strong duality holds: f = g. Can be further refined to strict inequalities over the nonaffine h i, i = 1,... m 3

4 Outline Today: KKT conditions Examples Constrained and Lagrange forms Uniqueness with l 1 penalties 4

5 Given general problem Karush-Kuhn-Tucker conditions min x subject to f(x) h i (x) 0, i = 1,... m l j (x) = 0, j = 1,... r The Karush-Kuhn-Tucker conditions or KKT conditions are: ( m r ) 0 f(x) + u i h i (x) + v j l j (x) (stationarity) u i h i (x) = 0 for all i j=1 h i (x) 0, l j (x) = 0 for all i, j u i 0 for all i (complementary slackness) (primal feasibility) (dual feasibility) 5

6 Necessity Let x and u, v be primal and dual solutions with zero duality gap (strong duality holds, e.g., under Slater s condition). Then f(x ) = g(u, v ) = min x f(x) + f(x ) + f(x ) m u i h i (x) + m u i h i (x ) + r vj l j (x) j=1 r vj l j (x ) j=1 In other words, all these inequalities are actually equalities 6

7 Two things to learn from this: The point x minimizes L(x, u, v ) over x R n. Hence the subdifferential of L(x, u, v ) must contain 0 at x = x this is exactly the stationarity condition We must have m u i h i(x ) = 0, and since each term here is 0, this implies u i h i(x ) = 0 for every i this is exactly complementary slackness Primal and dual feasibility hold by virtue of optimality. Therefore: If x and u, v are primal and dual solutions, with zero duality gap, then x, u, v satisfy the KKT conditions (Note that this statement assumes nothing a priori about convexity of our problem, i.e., of f, h i, l j ) 7

8 Sufficiency If there exists x, u, v that satisfy the KKT conditions, then g(u, v ) = f(x ) + = f(x ) m u i h i (x ) + r vj l j (x ) j=1 where the first equality holds from stationarity, and the second holds from complementary slackness Therefore the duality gap is zero (and x and u, v are primal and dual feasible) so x and u, v are primal and dual optimal. Hence, we ve shown: If x and u, v satisfy the KKT conditions, then x and u, v are primal and dual solutions 8

9 Putting it together In summary, KKT conditions: always sufficient necessary under strong duality Putting it together: For a problem with strong duality (e.g., assume Slater s condition: convex problem and there exists x strictly satisfying nonaffine inequality contraints), x and u, v are primal and dual solutions x and u, v satisfy the KKT conditions (Warning, concerning the stationarity condition: for a differentiable function f, we cannot use f(x) = { f(x)} unless f is convex!) 9

10 10 What s in a name? Older folks will know these as the KT (Kuhn-Tucker) conditions: First appeared in publication by Kuhn and Tucker in 1951 Later people found out that Karush had the conditions in his unpublished master s thesis of 1939 For unconstrained problems, the KKT conditions are nothing more than the subgradient optimality condition For general convex problems, the KKT conditions could have been derived entirely from studying optimality via subgradients 0 f(x ) + m N {hi 0}(x ) + r N {lj =0}(x ) j=1 where recall N C (x) is the normal cone of C at x

11 11 Example: quadratic with equality constraints Consider for Q 0, min 2 xt Qx + c T x subject to Ax = 0 x 1 E.g., as we will see, this corresponds to Newton step for equalityconstrained problem min x f(x) subject to Ax = b Convex problem, no inequality constraints, so by KKT conditions: x is a solution if and only if [ ] ] Q A T A 0 ] [ x u = [ c 0 for some u. Linear system combines stationarity, primal feasibility (complementary slackness and dual feasibility are vacuous)

12 12 Example: water-filling Example from B & V page 245: consider problem min x n log(α i + x i ) subject to x 0, 1 T x = 1 Information theory: think of log(α i + x i ) as communication rate of ith channel. KKT conditions: 1/(α i + x i ) u i + v = 0, i = 1,... n Eliminate u: u i x i = 0, i = 1,... n, x 0, 1 T x = 1, u 0 1/(α i + x i ) v, i = 1,... n x i (v 1/(α i + x i )) = 0, i = 1,... n, x 0, 1 T x = 1

13 13 Can argue directly stationarity and complementary slackness imply { 1/v α i if v < 1/α i x i = = max{0, 1/v α i }, i = 1,... n 0 if v 1/α i Still need x to be feasible, i.e., 1 T x = 1, and this gives n max{0, 1/v α i } = 1 Univariate equation, piecewise linear in 1/v and not hard to solve This reduced problem is called water-filling (From B & V page 246) 1/ν x i α i i

14 14 Example: support vector machines Given y { 1, 1} n, and X R n p, the support vector machine problem is: min β,β 0,ξ subject to 1 2 β C n ξ i ξ i 0, i = 1,... n y i (x T i β + β 0 ) 1 ξ i, i = 1,... n Introduce dual variables v, w 0. KKT stationarity condition: n 0 = w i y i, β = Complementary slackness: n w i y i x i, w = C1 v v i ξ i = 0, w i ( 1 ξi y i (x T i β + β 0 ) ) = 0, i = 1,... n

15 15 Hence at optimality we have β = n w iy i x i, and w i is nonzero only if y i (x T i β + β 0) = 1 ξ i. Such points i are called the support points For support point i, if ξ i = 0, then x i lies on edge of margin, and w i (0, C]; For support point i, if ξ i 0, then x i lies on wrong side of margin, and w i = C x T β + β 0 =0 ξ 4 ξ5 ξ ξ 3 1 ξ 2 M = 1 β M = 1 β margin KKT conditions do not really give us a way to find solution, but gives a better understanding In fact, we can use this to screen away non-support points before performing optimization

16 16 Constrained and Lagrange forms Often in statistics and machine learning we ll switch back and forth between constrained form, where t R is a tuning parameter, min x f(x) subject to h(x) t (C) and Lagrange form, where λ 0 is a tuning parameter, min x f(x) + λ h(x) (L) and claim these are equivalent. Is this true (assuming convex f, h)? (C) to (L): if problem (C) is strictly feasible, then strong duality holds, and there exists some λ 0 (dual solution) such that any solution x in (C) minimizes so x is also a solution in (L) f(x) + λ (h(x) t)

17 17 (L) to (C): if x is a solution in (L), then the KKT conditions for (C) are satisfied by taking t = h(x ), so x is a solution in (C) Conclusion: {solutions in (L)} λ 0 λ 0 {solutions in (L)} {solutions in (C)} t {solutions in (C)} t such that (C) is strictly feasible This is nearly a perfect equivalence. Note: when the only value of t that leads to a feasible but not strictly feasible constraint set is t = 0, then we do get perfect equivalence So, e.g., if h 0, and (C), (L) are feasible for all t, λ 0, then we do get perfect equivalence

18 Uniqueness in l 1 penalized problems Using the KKT conditions and simple probability arguments, we have the following (perhaps surprising) result: Theorem: Let f be differentiable and strictly convex, let X R n p, λ > 0. Consider min β f(xβ) + λ β 1 If the entries of X are drawn from a continuous probability distribution (on R np ), then w.p. 1 there is a unique solution and it has at most min{n, p} nonzero components Remark: here f must be strictly convex, but no restrictions on the dimensions of X (we could have p n) Proof: the KKT conditions are { X T {sign(β i )} if β i 0 f(xβ) = λs, s i [ 1, 1] if β i = 0, i = 1,... n 18

19 19 Note that Xβ, s are unique. Define S = {j : Xj T f(xβ) = λ}, also unique, and note that any solution satisfies β i = 0 for all i / S First assume that rank(x S ) < S (here X R n S, submatrix of X corresponding to columns in S). Then for some i S, X i = for constants c j R, so that s i X i = j S\{i} j S\{i} c j X j s j c j λ(s j X j ) Hence taking an inner product with f(xβ), λ = j S\{i} (s i s j c j )λ, i.e., j S\{i} s i s j c j = 1

20 20 In other words, we ve proved that rank(x S ) < S implies s i X i is in the affine span of s j X j, j S \ {i} (subspace of dimension < n) We say that the matrix X has columns in general position if any affine subspace L of dimension k < n does not contain more than k + 1 elements; of {±X 1,... ± X p } (excluding antipodal pairs) It is straightforward to show that, if the entries of X have a density over R np, then X is in general position with probability 1 X 3 X 4 X 2 X 1

21 21 Therefore, if entries of X are drawn from continuous probability distribution, any solution must satisfy rank(x S ) = S Recalling the KKT conditions, this means the number of nonzero components in any solution at most S min{n, p}. Further, we can reduce our optimization problem (by partially solving) to min β S R S f(x S β S ) + λ β S 1 Finally, strict convexity implies uniqueness of the solution in this problem, and hence in our original problem

22 22 Back to duality One of the most important uses of duality is that, under strong duality, we can characterize primal solutions from dual solutions Recall that under strong duality, the KKT conditions are necessary for optimality. Given dual solutions u, v, any primal solution x satisfies the stationarity condition 0 f(x ) + m u i h i (x ) + In other words, x solves min x L(x, u, v ) r vi l j (x ) j=1 Generally, this reveals a characterization of primal solutions In particular, if this is satisfied uniquely (i.e., above problem has a unique minimizer), then the corresponding point must be the primal solution

23 23 References S. Boyd and L. Vandenberghe (2004), Convex optimization, Chapter 5 R. T. Rockafellar (1970), Convex analysis, Chapters 28 30

### Lecture 6: Conic Optimization September 8

IE 598: Big Data Optimization Fall 2016 Lecture 6: Conic Optimization September 8 Lecturer: Niao He Scriber: Juan Xu Overview In this lecture, we finish up our previous discussion on optimality conditions

### Optimization for Machine Learning

Optimization for Machine Learning (Problems; Algorithms - A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html

### 14. Duality. ˆ Upper and lower bounds. ˆ General duality. ˆ Constraint qualifications. ˆ Counterexample. ˆ Complementary slackness.

CS/ECE/ISyE 524 Introduction to Optimization Spring 2016 17 14. Duality ˆ Upper and lower bounds ˆ General duality ˆ Constraint qualifications ˆ Counterexample ˆ Complementary slackness ˆ Examples ˆ Sensitivity

### On the Method of Lagrange Multipliers

On the Method of Lagrange Multipliers Reza Nasiri Mahalati November 6, 2016 Most of what is in this note is taken from the Convex Optimization book by Stephen Boyd and Lieven Vandenberghe. This should

### Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 0-0 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R

### The Karush-Kuhn-Tucker (KKT) conditions

The Karush-Kuhn-Tucker (KKT) conditions In this section, we will give a set of sufficient (and at most times necessary) conditions for a x to be the solution of a given convex optimization problem. These

### Motivation. Lecture 2 Topics from Optimization and Duality. network utility maximization (NUM) problem:

CDS270 Maryam Fazel Lecture 2 Topics from Optimization and Duality Motivation network utility maximization (NUM) problem: consider a network with S sources (users), each sending one flow at rate x s, through

### Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

### Convex Optimization Overview (cnt d)

Conve Optimization Overview (cnt d) Chuong B. Do November 29, 2009 During last week s section, we began our study of conve optimization, the study of mathematical optimization problems of the form, minimize

### Convex Optimization and Modeling

Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.-Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual

### Convex Optimization Lecture 6: KKT Conditions, and applications

Convex Optimization Lecture 6: KKT Conditions, and applications Dr. Michel Baes, IFOR / ETH Zürich Quick recall of last week s lecture Various aspects of convexity: The set of minimizers is convex. Convex

### Dual methods and ADMM. Barnabas Poczos & Ryan Tibshirani Convex Optimization /36-725

Dual methods and ADMM Barnabas Poczos & Ryan Tibshirani Convex Optimization 10-725/36-725 1 Given f : R n R, the function is called its conjugate Recall conjugate functions f (y) = max x R n yt x f(x)

### Coordinate descent. Geoff Gordon & Ryan Tibshirani Optimization /

Coordinate descent Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Adding to the toolbox, with stats and ML in mind We ve seen several general and useful minimization tools First-order methods

### Chap 2. Optimality conditions

Chap 2. Optimality conditions Version: 29-09-2012 2.1 Optimality conditions in unconstrained optimization Recall the definitions of global, local minimizer. Geometry of minimization Consider for f C 1

### 10701 Recitation 5 Duality and SVM. Ahmed Hefny

10701 Recitation 5 Duality and SVM Ahmed Hefny Outline Langrangian and Duality The Lagrangian Duality Eamples Support Vector Machines Primal Formulation Dual Formulation Soft Margin and Hinge Loss Lagrangian

### Sparsity and the Lasso

Sparsity and the Lasso Statistical Machine Learning, Spring 205 Ryan Tibshirani (with Larry Wasserman Regularization and the lasso. A bit of background If l 2 was the norm of the 20th century, then l is

### Support Vector Machines

EE 17/7AT: Optimization Models in Engineering Section 11/1 - April 014 Support Vector Machines Lecturer: Arturo Fernandez Scribe: Arturo Fernandez 1 Support Vector Machines Revisited 1.1 Strictly) Separable

### Newton s Method. Ryan Tibshirani Convex Optimization /36-725

Newton s Method Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, Properties and examples: f (y) = max x

### Support Vector Machine (SVM) and Kernel Methods

Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2015 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin

### Support Vector Machines and Kernel Methods

2018 CS420 Machine Learning, Lecture 3 Hangout from Prof. Andrew Ng. http://cs229.stanford.edu/notes/cs229-notes3.pdf Support Vector Machines and Kernel Methods Weinan Zhang Shanghai Jiao Tong University

### Lecture 8. Strong Duality Results. September 22, 2008

Strong Duality Results September 22, 2008 Outline Lecture 8 Slater Condition and its Variations Convex Objective with Linear Inequality Constraints Quadratic Objective over Quadratic Constraints Representation

### Lecture 23: November 21

10-725/36-725: Convex Optimization Fall 2016 Lecturer: Ryan Tibshirani Lecture 23: November 21 Scribes: Yifan Sun, Ananya Kumar, Xin Lu Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer:

### Quiz Discussion. IE417: Nonlinear Programming: Lecture 12. Motivation. Why do we care? Jeff Linderoth. 16th March 2006

Quiz Discussion IE417: Nonlinear Programming: Lecture 12 Jeff Linderoth Department of Industrial and Systems Engineering Lehigh University 16th March 2006 Motivation Why do we care? We are interested in

### Primal-Dual Interior-Point Methods. Ryan Tibshirani Convex Optimization /36-725

Primal-Dual Interior-Point Methods Ryan Tibshirani Convex Optimization 10-725/36-725 Given the problem Last time: barrier method min x subject to f(x) h i (x) 0, i = 1,... m Ax = b where f, h i, i = 1,...

### Support Vector Machines

Wien, June, 2010 Paul Hofmarcher, Stefan Theussl, WU Wien Hofmarcher/Theussl SVM 1/21 Linear Separable Separating Hyperplanes Non-Linear Separable Soft-Margin Hyperplanes Hofmarcher/Theussl SVM 2/21 (SVM)

### Newton s Method. Javier Peña Convex Optimization /36-725

Newton s Method Javier Peña Convex Optimization 10-725/36-725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, f ( (y) = max y T x f(x) ) x Properties and

### Date: July 5, Contents

2 Lagrange Multipliers Date: July 5, 2001 Contents 2.1. Introduction to Lagrange Multipliers......... p. 2 2.2. Enhanced Fritz John Optimality Conditions...... p. 14 2.3. Informative Lagrange Multipliers...........

### Tutorial on Convex Optimization for Engineers Part II

Tutorial on Convex Optimization for Engineers Part II M.Sc. Jens Steinwandt Communications Research Laboratory Ilmenau University of Technology PO Box 100565 D-98684 Ilmenau, Germany jens.steinwandt@tu-ilmenau.de

Conditional Gradient (Frank-Wolfe) Method Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 1 Outline Today: Conditional gradient method Convergence analysis Properties

### SECTION C: CONTINUOUS OPTIMISATION LECTURE 9: FIRST ORDER OPTIMALITY CONDITIONS FOR CONSTRAINED NONLINEAR PROGRAMMING

Nf SECTION C: CONTINUOUS OPTIMISATION LECTURE 9: FIRST ORDER OPTIMALITY CONDITIONS FOR CONSTRAINED NONLINEAR PROGRAMMING f(x R m g HONOUR SCHOOL OF MATHEMATICS, OXFORD UNIVERSITY HILARY TERM 5, DR RAPHAEL

### Lecture 1. 1 Conic programming. MA 796S: Convex Optimization and Interior Point Methods October 8, Consider the conic program. min.

MA 796S: Convex Optimization and Interior Point Methods October 8, 2007 Lecture 1 Lecturer: Kartik Sivaramakrishnan Scribe: Kartik Sivaramakrishnan 1 Conic programming Consider the conic program min s.t.

### Lagrangian Duality for Dummies

Lagrangian Duality for Dummies David Knowles November 13, 2010 We want to solve the following optimisation problem: f 0 () (1) such that f i () 0 i 1,..., m (2) For now we do not need to assume conveity.

### Linear Programming: Simplex

Linear Programming: Simplex Stephen J. Wright 1 2 Computer Sciences Department, University of Wisconsin-Madison. IMA, August 2016 Stephen Wright (UW-Madison) Linear Programming: Simplex IMA, August 2016

### Applied Lagrange Duality for Constrained Optimization

Applied Lagrange Duality for Constrained Optimization February 12, 2002 Overview The Practical Importance of Duality ffl Review of Convexity ffl A Separating Hyperplane Theorem ffl Definition of the Dual

### Lecture Support Vector Machine (SVM) Classifiers

Introduction to Machine Learning Lecturer: Amir Globerson Lecture 6 Fall Semester Scribe: Yishay Mansour 6.1 Support Vector Machine (SVM) Classifiers Classification is one of the most important tasks in

### 10 Numerical methods for constrained problems

10 Numerical methods for constrained problems min s.t. f(x) h(x) = 0 (l), g(x) 0 (m), x X The algorithms can be roughly divided the following way: ˆ primal methods: find descent direction keeping inside

### Lagrangian Duality. Richard Lusby. Department of Management Engineering Technical University of Denmark

Lagrangian Duality Richard Lusby Department of Management Engineering Technical University of Denmark Today s Topics (jg Lagrange Multipliers Lagrangian Relaxation Lagrangian Duality R Lusby (42111) Lagrangian

### Lecture 5. Theorems of Alternatives and Self-Dual Embedding

IE 8534 1 Lecture 5. Theorems of Alternatives and Self-Dual Embedding IE 8534 2 A system of linear equations may not have a solution. It is well known that either Ax = c has a solution, or A T y = 0, c

### EE 227A: Convex Optimization and Applications October 14, 2008

EE 227A: Convex Optimization and Applications October 14, 2008 Lecture 13: SDP Duality Lecturer: Laurent El Ghaoui Reading assignment: Chapter 5 of BV. 13.1 Direct approach 13.1.1 Primal problem Consider

### Primal-Dual Interior-Point Methods

Primal-Dual Interior-Point Methods Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 Outline Today: Primal-dual interior-point method Special case: linear programming

### Assignment 1: From the Definition of Convexity to Helley Theorem

Assignment 1: From the Definition of Convexity to Helley Theorem Exercise 1 Mark in the following list the sets which are convex: 1. {x R 2 : x 1 + i 2 x 2 1, i = 1,..., 10} 2. {x R 2 : x 2 1 + 2ix 1x

### Lecture 13: Duality Uses and Correspondences

10-725/36-725: Conve Optimization Fall 2016 Lectre 13: Dality Uses and Correspondences Lectrer: Ryan Tibshirani Scribes: Yichong X, Yany Liang, Yanning Li Note: LaTeX template cortesy of UC Berkeley EECS

### Support Vector Machines for Classification and Regression. 1 Linearly Separable Data: Hard Margin SVMs

E0 270 Machine Learning Lecture 5 (Jan 22, 203) Support Vector Machines for Classification and Regression Lecturer: Shivani Agarwal Disclaimer: These notes are a brief summary of the topics covered in

### Solving Dual Problems

Lecture 20 Solving Dual Problems We consider a constrained problem where, in addition to the constraint set X, there are also inequality and linear equality constraints. Specifically the minimization problem

### CONVEX FUNCTIONS AND OPTIMIZATION TECHINIQUES A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

CONVEX FUNCTIONS AND OPTIMIZATION TECHINIQUES A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN MATHEMATICS SUBMITTED TO NATIONAL INSTITUTE OF TECHNOLOGY,

### The fundamental theorem of linear programming

The fundamental theorem of linear programming Michael Tehranchi June 8, 2017 This note supplements the lecture notes of Optimisation The statement of the fundamental theorem of linear programming and the

### Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method

Primal-Dual Interior-Point Methods for Linear Programming based on Newton s Method Robert M. Freund March, 2004 2004 Massachusetts Institute of Technology. The Problem The logarithmic barrier approach

### Mathematical Foundations -1- Constrained Optimization. Constrained Optimization. An intuitive approach 2. First Order Conditions (FOC) 7

Mathematical Foundations -- Constrained Optimization Constrained Optimization An intuitive approach First Order Conditions (FOC) 7 Constraint qualifications 9 Formal statement of the FOC for a maximum

### LP Duality: outline. Duality theory for Linear Programming. alternatives. optimization I Idea: polyhedra

LP Duality: outline I Motivation and definition of a dual LP I Weak duality I Separating hyperplane theorem and theorems of the alternatives I Strong duality and complementary slackness I Using duality

### Lecture 15 Newton Method and Self-Concordance. October 23, 2008

Newton Method and Self-Concordance October 23, 2008 Outline Lecture 15 Self-concordance Notion Self-concordant Functions Operations Preserving Self-concordance Properties of Self-concordant Functions Implications

### The lasso. Patrick Breheny. February 15. The lasso Convex optimization Soft thresholding

Patrick Breheny February 15 Patrick Breheny High-Dimensional Data Analysis (BIOS 7600) 1/24 Introduction Last week, we introduced penalized regression and discussed ridge regression, in which the penalty

### CONVEX OPTIMIZATION, DUALITY, AND THEIR APPLICATION TO SUPPORT VECTOR MACHINES. Contents 1. Introduction 1 2. Convex Sets

CONVEX OPTIMIZATION, DUALITY, AND THEIR APPLICATION TO SUPPORT VECTOR MACHINES DANIEL HENDRYCKS Abstract. This paper develops the fundamentals of convex optimization and applies them to Support Vector

### Support Vector Machines

CS229 Lecture notes Andrew Ng Part V Support Vector Machines This set of notes presents the Support Vector Machine (SVM) learning algorithm. SVMs are among the best (and many believe is indeed the best)

### Symmetric and Asymmetric Duality

journal of mathematical analysis and applications 220, 125 131 (1998) article no. AY975824 Symmetric and Asymmetric Duality Massimo Pappalardo Department of Mathematics, Via Buonarroti 2, 56127, Pisa,

### You should be able to...

Lecture Outline Gradient Projection Algorithm Constant Step Length, Varying Step Length, Diminishing Step Length Complexity Issues Gradient Projection With Exploration Projection Solving QPs: active set

### STATIC LECTURE 4: CONSTRAINED OPTIMIZATION II - KUHN TUCKER THEORY

STATIC LECTURE 4: CONSTRAINED OPTIMIZATION II - KUHN TUCKER THEORY UNIVERSITY OF MARYLAND: ECON 600 1. Some Eamples 1 A general problem that arises countless times in economics takes the form: (Verbally):

### LAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION

LAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION ROMAN A. POLYAK Abstract. We introduce the Lagrangian Transformation(LT) and develop a general LT method for convex optimization problems. A class Ψ of

### Duality of LPs and Applications

Lecture 6 Duality of LPs and Applications Last lecture we introduced duality of linear programs. We saw how to form duals, and proved both the weak and strong duality theorems. In this lecture we will

### Gaps in Support Vector Optimization

Gaps in Support Vector Optimization Nikolas List 1 (student author), Don Hush 2, Clint Scovel 2, Ingo Steinwart 2 1 Lehrstuhl Mathematik und Informatik, Ruhr-University Bochum, Germany nlist@lmi.rub.de

### A Tutorial on Convex Optimization II: Duality and Interior Point Methods

A Tutorial on Convex Optimization II: Duality and Interior Point Methods Haitham Hindi Palo Alto Research Center (PARC), Palo Alto, California 94304 email: hhindi@parc.com Abstract In recent years, convex

### Optimization Tutorial 1. Basic Gradient Descent

E0 270 Machine Learning Jan 16, 2015 Optimization Tutorial 1 Basic Gradient Descent Lecture by Harikrishna Narasimhan Note: This tutorial shall assume background in elementary calculus and linear algebra.

### Lecture 1: Introduction. Outline. B9824 Foundations of Optimization. Fall Administrative matters. 2. Introduction. 3. Existence of optima

B9824 Foundations of Optimization Lecture 1: Introduction Fall 2009 Copyright 2009 Ciamac Moallemi Outline 1. Administrative matters 2. Introduction 3. Existence of optima 4. Local theory of unconstrained

### Lecture 2: Convex Sets and Functions

Lecture 2: Convex Sets and Functions Hyang-Won Lee Dept. of Internet & Multimedia Eng. Konkuk University Lecture 2 Network Optimization, Fall 2015 1 / 22 Optimization Problems Optimization problems are

### Constrained optimization: direct methods (cont.)

Constrained optimization: direct methods (cont.) Jussi Hakanen Post-doctoral researcher jussi.hakanen@jyu.fi Direct methods Also known as methods of feasible directions Idea in a point x h, generate a

### Solution Methods. Richard Lusby. Department of Management Engineering Technical University of Denmark

Solution Methods Richard Lusby Department of Management Engineering Technical University of Denmark Lecture Overview (jg Unconstrained Several Variables Quadratic Programming Separable Programming SUMT

### Nonlinear Programming 3rd Edition. Theoretical Solutions Manual Chapter 6

Nonlinear Programming 3rd Edition Theoretical Solutions Manual Chapter 6 Dimitri P. Bertsekas Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts 1 NOTE This manual contains

### IE 5531 Midterm #2 Solutions

IE 5531 Midterm #2 s Prof. John Gunnar Carlsson November 9, 2011 Before you begin: This exam has 9 pages and a total of 5 problems. Make sure that all pages are present. To obtain credit for a problem,

### Module 04 Optimization Problems KKT Conditions & Solvers

Module 04 Optimization Problems KKT Conditions & Solvers Ahmad F. Taha EE 5243: Introduction to Cyber-Physical Systems Email: ahmad.taha@utsa.edu Webpage: http://engineering.utsa.edu/ taha/index.html September

### Advanced Mathematical Programming IE417. Lecture 24. Dr. Ted Ralphs

Advanced Mathematical Programming IE417 Lecture 24 Dr. Ted Ralphs IE417 Lecture 24 1 Reading for This Lecture Sections 11.2-11.2 IE417 Lecture 24 2 The Linear Complementarity Problem Given M R p p and

### Lecture 13: Constrained optimization

2010-12-03 Basic ideas A nonlinearly constrained problem must somehow be converted relaxed into a problem which we can solve (a linear/quadratic or unconstrained problem) We solve a sequence of such problems

### Lecture 17: Primal-dual interior-point methods part II

10-725/36-725: Convex Optimization Spring 2015 Lecture 17: Primal-dual interior-point methods part II Lecturer: Javier Pena Scribes: Pinchao Zhang, Wei Ma Note: LaTeX template courtesy of UC Berkeley EECS

### LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE

LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE CONVEX ANALYSIS AND DUALITY Basic concepts of convex analysis Basic concepts of convex optimization Geometric duality framework - MC/MC Constrained optimization

### Numerical optimization

Numerical optimization Lecture 4 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of non-rigid shapes Stanford University, Winter 2009 2 Longest Slowest Shortest Minimal Maximal

### ELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications

ELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications Professor M. Chiang Electrical Engineering Department, Princeton University March

### Lecture 1: Introduction. Outline. B9824 Foundations of Optimization. Fall Administrative matters. 2. Introduction. 3. Existence of optima

B9824 Foundations of Optimization Lecture 1: Introduction Fall 2010 Copyright 2010 Ciamac Moallemi Outline 1. Administrative matters 2. Introduction 3. Existence of optima 4. Local theory of unconstrained

### Finite Dimensional Optimization Part I: The KKT Theorem 1

John Nachbar Washington University March 26, 2018 1 Introduction Finite Dimensional Optimization Part I: The KKT Theorem 1 These notes characterize maxima and minima in terms of first derivatives. I focus

### Soft-margin SVM can address linearly separable problems with outliers

Non-linear Support Vector Machines Non-linearly separable problems Hard-margin SVM can address linearly separable problems Soft-margin SVM can address linearly separable problems with outliers Non-linearly

### Lecture 10: A brief introduction to Support Vector Machine

Lecture 10: A brief introduction to Support Vector Machine Advanced Applied Multivariate Analysis STAT 2221, Fall 2013 Sungkyu Jung Department of Statistics, University of Pittsburgh Xingye Qiao Department

### Optimization methods

Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on first-order information,

### The Kuhn-Tucker Problem

Natalia Lazzati Mathematics for Economics (Part I) Note 8: Nonlinear Programming - The Kuhn-Tucker Problem Note 8 is based on de la Fuente (2000, Ch. 7) and Simon and Blume (1994, Ch. 18 and 19). The Kuhn-Tucker

### Nonlinear Programming (NLP)

Natalia Lazzati Mathematics for Economics (Part I) Note 6: Nonlinear Programming - Unconstrained Optimization Note 6 is based on de la Fuente (2000, Ch. 7), Madden (1986, Ch. 3 and 5) and Simon and Blume

### Review of Optimization Basics

Review of Optimization Basics. Introduction Electricity markets throughout the US are said to have a two-settlement structure. The reason for this is that the structure includes two different markets:

### Jørgen Tind, Department of Statistics and Operations Research, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen O, Denmark.

DUALITY THEORY Jørgen Tind, Department of Statistics and Operations Research, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen O, Denmark. Keywords: Duality, Saddle point, Complementary

### ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications

ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications Professor M. Chiang Electrical Engineering Department, Princeton University February

### SUPPORT VECTOR MACHINE FOR THE SIMULTANEOUS APPROXIMATION OF A FUNCTION AND ITS DERIVATIVE

SUPPORT VECTOR MACHINE FOR THE SIMULTANEOUS APPROXIMATION OF A FUNCTION AND ITS DERIVATIVE M. Lázaro 1, I. Santamaría 2, F. Pérez-Cruz 1, A. Artés-Rodríguez 1 1 Departamento de Teoría de la Señal y Comunicaciones

### 8. Geometric problems

8. Geometric problems Convex Optimization Boyd & Vandenberghe extremal volume ellipsoids centering classification placement and facility location 8 1 Minimum volume ellipsoid around a set Löwner-John ellipsoid

### SMO Algorithms for Support Vector Machines without Bias Term

Institute of Automatic Control Laboratory for Control Systems and Process Automation Prof. Dr.-Ing. Dr. h. c. Rolf Isermann SMO Algorithms for Support Vector Machines without Bias Term Michael Vogt, 18-Jul-2002

### A Review of Linear Programming

A Review of Linear Programming Instructor: Farid Alizadeh IEOR 4600y Spring 2001 February 14, 2001 1 Overview In this note we review the basic properties of linear programming including the primal simplex

### Rank-one LMIs and Lyapunov's Inequality. Gjerrit Meinsma 4. Abstract. We describe a new proof of the well-known Lyapunov's matrix inequality about

Rank-one LMIs and Lyapunov's Inequality Didier Henrion 1;; Gjerrit Meinsma Abstract We describe a new proof of the well-known Lyapunov's matrix inequality about the location of the eigenvalues of a matrix

### Agenda. Interior Point Methods. 1 Barrier functions. 2 Analytic center. 3 Central path. 4 Barrier method. 5 Primal-dual path following algorithms

Agenda Interior Point Methods 1 Barrier functions 2 Analytic center 3 Central path 4 Barrier method 5 Primal-dual path following algorithms 6 Nesterov Todd scaling 7 Complexity analysis Interior point

### A Polyhedral Cone Counterexample

Division of the Humanities and Social Sciences A Polyhedral Cone Counterexample KCB 9/20/2003 Revised 12/12/2003 Abstract This is an example of a pointed generating convex cone in R 4 with 5 extreme rays,

### 15. Conic optimization

L. Vandenberghe EE236C (Spring 216) 15. Conic optimization conic linear program examples modeling duality 15-1 Generalized (conic) inequalities Conic inequality: a constraint x K where K is a convex cone

### 1 Convexity, concavity and quasi-concavity. (SB )

UNIVERSITY OF MARYLAND ECON 600 Summer 2010 Lecture Two: Unconstrained Optimization 1 Convexity, concavity and quasi-concavity. (SB 21.1-21.3.) For any two points, x, y R n, we can trace out the line of

### ORIE 6300 Mathematical Programming I August 25, Lecture 2

ORIE 6300 Mathematical Programming I August 25, 2016 Lecturer: Damek Davis Lecture 2 Scribe: Johan Bjorck Last time, we considered the dual of linear programs in our basic form: max(c T x : Ax b). We also

### HW1 solutions. 1. α Ef(x) β, where Ef(x) is the expected value of f(x), i.e., Ef(x) = n. i=1 p if(a i ). (The function f : R R is given.

HW1 solutions Exercise 1 (Some sets of probability distributions.) Let x be a real-valued random variable with Prob(x = a i ) = p i, i = 1,..., n, where a 1 < a 2 < < a n. Of course p R n lies in the standard

### Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5

Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Instructor: Farid Alizadeh Scribe: Anton Riabov 10/08/2001 1 Overview We continue studying the maximum eigenvalue SDP, and generalize

### Dual and primal-dual methods

ELE 538B: Large-Scale Optimization for Data Science Dual and primal-dual methods Yuxin Chen Princeton University, Spring 2018 Outline Dual proximal gradient method Primal-dual proximal gradient method

### Uniqueness of the SVM Solution

Uniqueness of the SVM Solution Christopher J.C. Burges Advanced Technologies, Bell Laboratories, Lucent Technologies Holmdel, New Jersey burges@iucent.com David J. Crisp Centre for Sensor Signal and Information