KarushKuhnTucker Conditions. Lecturer: Ryan Tibshirani Convex Optimization /36725


 Benedict Dixon
 1 years ago
 Views:
Transcription
1 KarushKuhnTucker Conditions Lecturer: Ryan Tibshirani Convex Optimization /
2 Given a minimization problem Last time: duality min x subject to f(x) h i (x) 0, i = 1,... m l j (x) = 0, j = 1,... r we defined the Lagrangian: L(x, u, v) = f(x) + and Lagrange dual function: m u i h i (x) + g(u, v) = min x L(x, u, v) r v j l j (x) j=1 2
3 The subsequent dual problem is: Important properties: max u,v g(u, v) subject to u 0 Dual problem is always convex, i.e., g is always concave (even if primal problem is not convex) The primal and dual optimal values, f and g, always satisfy weak duality: f g Slater s condition: for convex primal, if there is an x such that h 1 (x) < 0,... h m (x) < 0 and l 1 (x) = 0,... l r (x) = 0 then strong duality holds: f = g. Can be further refined to strict inequalities over the nonaffine h i, i = 1,... m 3
4 Outline Today: KKT conditions Examples Constrained and Lagrange forms Uniqueness with l 1 penalties 4
5 Given general problem KarushKuhnTucker conditions min x subject to f(x) h i (x) 0, i = 1,... m l j (x) = 0, j = 1,... r The KarushKuhnTucker conditions or KKT conditions are: ( m r ) 0 f(x) + u i h i (x) + v j l j (x) (stationarity) u i h i (x) = 0 for all i j=1 h i (x) 0, l j (x) = 0 for all i, j u i 0 for all i (complementary slackness) (primal feasibility) (dual feasibility) 5
6 Necessity Let x and u, v be primal and dual solutions with zero duality gap (strong duality holds, e.g., under Slater s condition). Then f(x ) = g(u, v ) = min x f(x) + f(x ) + f(x ) m u i h i (x) + m u i h i (x ) + r vj l j (x) j=1 r vj l j (x ) j=1 In other words, all these inequalities are actually equalities 6
7 Two things to learn from this: The point x minimizes L(x, u, v ) over x R n. Hence the subdifferential of L(x, u, v ) must contain 0 at x = x this is exactly the stationarity condition We must have m u i h i(x ) = 0, and since each term here is 0, this implies u i h i(x ) = 0 for every i this is exactly complementary slackness Primal and dual feasibility hold by virtue of optimality. Therefore: If x and u, v are primal and dual solutions, with zero duality gap, then x, u, v satisfy the KKT conditions (Note that this statement assumes nothing a priori about convexity of our problem, i.e., of f, h i, l j ) 7
8 Sufficiency If there exists x, u, v that satisfy the KKT conditions, then g(u, v ) = f(x ) + = f(x ) m u i h i (x ) + r vj l j (x ) j=1 where the first equality holds from stationarity, and the second holds from complementary slackness Therefore the duality gap is zero (and x and u, v are primal and dual feasible) so x and u, v are primal and dual optimal. Hence, we ve shown: If x and u, v satisfy the KKT conditions, then x and u, v are primal and dual solutions 8
9 Putting it together In summary, KKT conditions: always sufficient necessary under strong duality Putting it together: For a problem with strong duality (e.g., assume Slater s condition: convex problem and there exists x strictly satisfying nonaffine inequality contraints), x and u, v are primal and dual solutions x and u, v satisfy the KKT conditions (Warning, concerning the stationarity condition: for a differentiable function f, we cannot use f(x) = { f(x)} unless f is convex!) 9
10 10 What s in a name? Older folks will know these as the KT (KuhnTucker) conditions: First appeared in publication by Kuhn and Tucker in 1951 Later people found out that Karush had the conditions in his unpublished master s thesis of 1939 For unconstrained problems, the KKT conditions are nothing more than the subgradient optimality condition For general convex problems, the KKT conditions could have been derived entirely from studying optimality via subgradients 0 f(x ) + m N {hi 0}(x ) + r N {lj =0}(x ) j=1 where recall N C (x) is the normal cone of C at x
11 11 Example: quadratic with equality constraints Consider for Q 0, min 2 xt Qx + c T x subject to Ax = 0 x 1 E.g., as we will see, this corresponds to Newton step for equalityconstrained problem min x f(x) subject to Ax = b Convex problem, no inequality constraints, so by KKT conditions: x is a solution if and only if [ ] ] Q A T A 0 ] [ x u = [ c 0 for some u. Linear system combines stationarity, primal feasibility (complementary slackness and dual feasibility are vacuous)
12 12 Example: waterfilling Example from B & V page 245: consider problem min x n log(α i + x i ) subject to x 0, 1 T x = 1 Information theory: think of log(α i + x i ) as communication rate of ith channel. KKT conditions: 1/(α i + x i ) u i + v = 0, i = 1,... n Eliminate u: u i x i = 0, i = 1,... n, x 0, 1 T x = 1, u 0 1/(α i + x i ) v, i = 1,... n x i (v 1/(α i + x i )) = 0, i = 1,... n, x 0, 1 T x = 1
13 13 Can argue directly stationarity and complementary slackness imply { 1/v α i if v < 1/α i x i = = max{0, 1/v α i }, i = 1,... n 0 if v 1/α i Still need x to be feasible, i.e., 1 T x = 1, and this gives n max{0, 1/v α i } = 1 Univariate equation, piecewise linear in 1/v and not hard to solve This reduced problem is called waterfilling (From B & V page 246) 1/ν x i α i i
14 14 Example: support vector machines Given y { 1, 1} n, and X R n p, the support vector machine problem is: min β,β 0,ξ subject to 1 2 β C n ξ i ξ i 0, i = 1,... n y i (x T i β + β 0 ) 1 ξ i, i = 1,... n Introduce dual variables v, w 0. KKT stationarity condition: n 0 = w i y i, β = Complementary slackness: n w i y i x i, w = C1 v v i ξ i = 0, w i ( 1 ξi y i (x T i β + β 0 ) ) = 0, i = 1,... n
15 15 Hence at optimality we have β = n w iy i x i, and w i is nonzero only if y i (x T i β + β 0) = 1 ξ i. Such points i are called the support points For support point i, if ξ i = 0, then x i lies on edge of margin, and w i (0, C]; For support point i, if ξ i 0, then x i lies on wrong side of margin, and w i = C x T β + β 0 =0 ξ 4 ξ5 ξ ξ 3 1 ξ 2 M = 1 β M = 1 β margin KKT conditions do not really give us a way to find solution, but gives a better understanding In fact, we can use this to screen away nonsupport points before performing optimization
16 16 Constrained and Lagrange forms Often in statistics and machine learning we ll switch back and forth between constrained form, where t R is a tuning parameter, min x f(x) subject to h(x) t (C) and Lagrange form, where λ 0 is a tuning parameter, min x f(x) + λ h(x) (L) and claim these are equivalent. Is this true (assuming convex f, h)? (C) to (L): if problem (C) is strictly feasible, then strong duality holds, and there exists some λ 0 (dual solution) such that any solution x in (C) minimizes so x is also a solution in (L) f(x) + λ (h(x) t)
17 17 (L) to (C): if x is a solution in (L), then the KKT conditions for (C) are satisfied by taking t = h(x ), so x is a solution in (C) Conclusion: {solutions in (L)} λ 0 λ 0 {solutions in (L)} {solutions in (C)} t {solutions in (C)} t such that (C) is strictly feasible This is nearly a perfect equivalence. Note: when the only value of t that leads to a feasible but not strictly feasible constraint set is t = 0, then we do get perfect equivalence So, e.g., if h 0, and (C), (L) are feasible for all t, λ 0, then we do get perfect equivalence
18 Uniqueness in l 1 penalized problems Using the KKT conditions and simple probability arguments, we have the following (perhaps surprising) result: Theorem: Let f be differentiable and strictly convex, let X R n p, λ > 0. Consider min β f(xβ) + λ β 1 If the entries of X are drawn from a continuous probability distribution (on R np ), then w.p. 1 there is a unique solution and it has at most min{n, p} nonzero components Remark: here f must be strictly convex, but no restrictions on the dimensions of X (we could have p n) Proof: the KKT conditions are { X T {sign(β i )} if β i 0 f(xβ) = λs, s i [ 1, 1] if β i = 0, i = 1,... n 18
19 19 Note that Xβ, s are unique. Define S = {j : Xj T f(xβ) = λ}, also unique, and note that any solution satisfies β i = 0 for all i / S First assume that rank(x S ) < S (here X R n S, submatrix of X corresponding to columns in S). Then for some i S, X i = for constants c j R, so that s i X i = j S\{i} j S\{i} c j X j s j c j λ(s j X j ) Hence taking an inner product with f(xβ), λ = j S\{i} (s i s j c j )λ, i.e., j S\{i} s i s j c j = 1
20 20 In other words, we ve proved that rank(x S ) < S implies s i X i is in the affine span of s j X j, j S \ {i} (subspace of dimension < n) We say that the matrix X has columns in general position if any affine subspace L of dimension k < n does not contain more than k + 1 elements; of {±X 1,... ± X p } (excluding antipodal pairs) It is straightforward to show that, if the entries of X have a density over R np, then X is in general position with probability 1 X 3 X 4 X 2 X 1
21 21 Therefore, if entries of X are drawn from continuous probability distribution, any solution must satisfy rank(x S ) = S Recalling the KKT conditions, this means the number of nonzero components in any solution at most S min{n, p}. Further, we can reduce our optimization problem (by partially solving) to min β S R S f(x S β S ) + λ β S 1 Finally, strict convexity implies uniqueness of the solution in this problem, and hence in our original problem
22 22 Back to duality One of the most important uses of duality is that, under strong duality, we can characterize primal solutions from dual solutions Recall that under strong duality, the KKT conditions are necessary for optimality. Given dual solutions u, v, any primal solution x satisfies the stationarity condition 0 f(x ) + m u i h i (x ) + In other words, x solves min x L(x, u, v ) r vi l j (x ) j=1 Generally, this reveals a characterization of primal solutions In particular, if this is satisfied uniquely (i.e., above problem has a unique minimizer), then the corresponding point must be the primal solution
23 23 References S. Boyd and L. Vandenberghe (2004), Convex optimization, Chapter 5 R. T. Rockafellar (1970), Convex analysis, Chapters 28 30
Lecture 6: Conic Optimization September 8
IE 598: Big Data Optimization Fall 2016 Lecture 6: Conic Optimization September 8 Lecturer: Niao He Scriber: Juan Xu Overview In this lecture, we finish up our previous discussion on optimality conditions
More informationOptimization for Machine Learning
Optimization for Machine Learning (Problems; Algorithms  A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html
More information14. Duality. ˆ Upper and lower bounds. ˆ General duality. ˆ Constraint qualifications. ˆ Counterexample. ˆ Complementary slackness.
CS/ECE/ISyE 524 Introduction to Optimization Spring 2016 17 14. Duality ˆ Upper and lower bounds ˆ General duality ˆ Constraint qualifications ˆ Counterexample ˆ Complementary slackness ˆ Examples ˆ Sensitivity
More informationOn the Method of Lagrange Multipliers
On the Method of Lagrange Multipliers Reza Nasiri Mahalati November 6, 2016 Most of what is in this note is taken from the Convex Optimization book by Stephen Boyd and Lieven Vandenberghe. This should
More informationLecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem
Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 00 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R
More informationThe KarushKuhnTucker (KKT) conditions
The KarushKuhnTucker (KKT) conditions In this section, we will give a set of sufficient (and at most times necessary) conditions for a x to be the solution of a given convex optimization problem. These
More informationMotivation. Lecture 2 Topics from Optimization and Duality. network utility maximization (NUM) problem:
CDS270 Maryam Fazel Lecture 2 Topics from Optimization and Duality Motivation network utility maximization (NUM) problem: consider a network with S sources (users), each sending one flow at rate x s, through
More informationOptimality, Duality, Complementarity for Constrained Optimization
Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of WisconsinMadison May 2014 Wright (UWMadison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear
More informationConvex Optimization Overview (cnt d)
Conve Optimization Overview (cnt d) Chuong B. Do November 29, 2009 During last week s section, we began our study of conve optimization, the study of mathematical optimization problems of the form, minimize
More informationConvex Optimization and Modeling
Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual
More informationConvex Optimization Lecture 6: KKT Conditions, and applications
Convex Optimization Lecture 6: KKT Conditions, and applications Dr. Michel Baes, IFOR / ETH Zürich Quick recall of last week s lecture Various aspects of convexity: The set of minimizers is convex. Convex
More informationDual methods and ADMM. Barnabas Poczos & Ryan Tibshirani Convex Optimization /36725
Dual methods and ADMM Barnabas Poczos & Ryan Tibshirani Convex Optimization 10725/36725 1 Given f : R n R, the function is called its conjugate Recall conjugate functions f (y) = max x R n yt x f(x)
More informationCoordinate descent. Geoff Gordon & Ryan Tibshirani Optimization /
Coordinate descent Geoff Gordon & Ryan Tibshirani Optimization 10725 / 36725 1 Adding to the toolbox, with stats and ML in mind We ve seen several general and useful minimization tools Firstorder methods
More informationChap 2. Optimality conditions
Chap 2. Optimality conditions Version: 29092012 2.1 Optimality conditions in unconstrained optimization Recall the definitions of global, local minimizer. Geometry of minimization Consider for f C 1
More information10701 Recitation 5 Duality and SVM. Ahmed Hefny
10701 Recitation 5 Duality and SVM Ahmed Hefny Outline Langrangian and Duality The Lagrangian Duality Eamples Support Vector Machines Primal Formulation Dual Formulation Soft Margin and Hinge Loss Lagrangian
More informationSparsity and the Lasso
Sparsity and the Lasso Statistical Machine Learning, Spring 205 Ryan Tibshirani (with Larry Wasserman Regularization and the lasso. A bit of background If l 2 was the norm of the 20th century, then l is
More informationSupport Vector Machines
EE 17/7AT: Optimization Models in Engineering Section 11/1  April 014 Support Vector Machines Lecturer: Arturo Fernandez Scribe: Arturo Fernandez 1 Support Vector Machines Revisited 1.1 Strictly) Separable
More informationNewton s Method. Ryan Tibshirani Convex Optimization /36725
Newton s Method Ryan Tibshirani Convex Optimization 10725/36725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, Properties and examples: f (y) = max x
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE717: Machine Learning Sharif University of Technology Fall 2015 Soleymani Outline Margin concept HardMargin SVM SoftMargin SVM Dual Problems of HardMargin
More informationSupport Vector Machines and Kernel Methods
2018 CS420 Machine Learning, Lecture 3 Hangout from Prof. Andrew Ng. http://cs229.stanford.edu/notes/cs229notes3.pdf Support Vector Machines and Kernel Methods Weinan Zhang Shanghai Jiao Tong University
More informationLecture 8. Strong Duality Results. September 22, 2008
Strong Duality Results September 22, 2008 Outline Lecture 8 Slater Condition and its Variations Convex Objective with Linear Inequality Constraints Quadratic Objective over Quadratic Constraints Representation
More informationLecture 23: November 21
10725/36725: Convex Optimization Fall 2016 Lecturer: Ryan Tibshirani Lecture 23: November 21 Scribes: Yifan Sun, Ananya Kumar, Xin Lu Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer:
More informationQuiz Discussion. IE417: Nonlinear Programming: Lecture 12. Motivation. Why do we care? Jeff Linderoth. 16th March 2006
Quiz Discussion IE417: Nonlinear Programming: Lecture 12 Jeff Linderoth Department of Industrial and Systems Engineering Lehigh University 16th March 2006 Motivation Why do we care? We are interested in
More informationPrimalDual InteriorPoint Methods. Ryan Tibshirani Convex Optimization /36725
PrimalDual InteriorPoint Methods Ryan Tibshirani Convex Optimization 10725/36725 Given the problem Last time: barrier method min x subject to f(x) h i (x) 0, i = 1,... m Ax = b where f, h i, i = 1,...
More informationSupport Vector Machines
Wien, June, 2010 Paul Hofmarcher, Stefan Theussl, WU Wien Hofmarcher/Theussl SVM 1/21 Linear Separable Separating Hyperplanes NonLinear Separable SoftMargin Hyperplanes Hofmarcher/Theussl SVM 2/21 (SVM)
More informationNewton s Method. Javier Peña Convex Optimization /36725
Newton s Method Javier Peña Convex Optimization 10725/36725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, f ( (y) = max y T x f(x) ) x Properties and
More informationDate: July 5, Contents
2 Lagrange Multipliers Date: July 5, 2001 Contents 2.1. Introduction to Lagrange Multipliers......... p. 2 2.2. Enhanced Fritz John Optimality Conditions...... p. 14 2.3. Informative Lagrange Multipliers...........
More informationTutorial on Convex Optimization for Engineers Part II
Tutorial on Convex Optimization for Engineers Part II M.Sc. Jens Steinwandt Communications Research Laboratory Ilmenau University of Technology PO Box 100565 D98684 Ilmenau, Germany jens.steinwandt@tuilmenau.de
More informationConditional Gradient (FrankWolfe) Method
Conditional Gradient (FrankWolfe) Method Lecturer: Aarti Singh Coinstructor: Pradeep Ravikumar Convex Optimization 10725/36725 1 Outline Today: Conditional gradient method Convergence analysis Properties
More informationSECTION C: CONTINUOUS OPTIMISATION LECTURE 9: FIRST ORDER OPTIMALITY CONDITIONS FOR CONSTRAINED NONLINEAR PROGRAMMING
Nf SECTION C: CONTINUOUS OPTIMISATION LECTURE 9: FIRST ORDER OPTIMALITY CONDITIONS FOR CONSTRAINED NONLINEAR PROGRAMMING f(x R m g HONOUR SCHOOL OF MATHEMATICS, OXFORD UNIVERSITY HILARY TERM 5, DR RAPHAEL
More informationLecture 1. 1 Conic programming. MA 796S: Convex Optimization and Interior Point Methods October 8, Consider the conic program. min.
MA 796S: Convex Optimization and Interior Point Methods October 8, 2007 Lecture 1 Lecturer: Kartik Sivaramakrishnan Scribe: Kartik Sivaramakrishnan 1 Conic programming Consider the conic program min s.t.
More informationLagrangian Duality for Dummies
Lagrangian Duality for Dummies David Knowles November 13, 2010 We want to solve the following optimisation problem: f 0 () (1) such that f i () 0 i 1,..., m (2) For now we do not need to assume conveity.
More informationLinear Programming: Simplex
Linear Programming: Simplex Stephen J. Wright 1 2 Computer Sciences Department, University of WisconsinMadison. IMA, August 2016 Stephen Wright (UWMadison) Linear Programming: Simplex IMA, August 2016
More informationApplied Lagrange Duality for Constrained Optimization
Applied Lagrange Duality for Constrained Optimization February 12, 2002 Overview The Practical Importance of Duality ffl Review of Convexity ffl A Separating Hyperplane Theorem ffl Definition of the Dual
More informationLecture Support Vector Machine (SVM) Classifiers
Introduction to Machine Learning Lecturer: Amir Globerson Lecture 6 Fall Semester Scribe: Yishay Mansour 6.1 Support Vector Machine (SVM) Classifiers Classification is one of the most important tasks in
More information10 Numerical methods for constrained problems
10 Numerical methods for constrained problems min s.t. f(x) h(x) = 0 (l), g(x) 0 (m), x X The algorithms can be roughly divided the following way: ˆ primal methods: find descent direction keeping inside
More informationLagrangian Duality. Richard Lusby. Department of Management Engineering Technical University of Denmark
Lagrangian Duality Richard Lusby Department of Management Engineering Technical University of Denmark Today s Topics (jg Lagrange Multipliers Lagrangian Relaxation Lagrangian Duality R Lusby (42111) Lagrangian
More informationLecture 5. Theorems of Alternatives and SelfDual Embedding
IE 8534 1 Lecture 5. Theorems of Alternatives and SelfDual Embedding IE 8534 2 A system of linear equations may not have a solution. It is well known that either Ax = c has a solution, or A T y = 0, c
More informationEE 227A: Convex Optimization and Applications October 14, 2008
EE 227A: Convex Optimization and Applications October 14, 2008 Lecture 13: SDP Duality Lecturer: Laurent El Ghaoui Reading assignment: Chapter 5 of BV. 13.1 Direct approach 13.1.1 Primal problem Consider
More informationPrimalDual InteriorPoint Methods
PrimalDual InteriorPoint Methods Lecturer: Aarti Singh Coinstructor: Pradeep Ravikumar Convex Optimization 10725/36725 Outline Today: Primaldual interiorpoint method Special case: linear programming
More informationAssignment 1: From the Definition of Convexity to Helley Theorem
Assignment 1: From the Definition of Convexity to Helley Theorem Exercise 1 Mark in the following list the sets which are convex: 1. {x R 2 : x 1 + i 2 x 2 1, i = 1,..., 10} 2. {x R 2 : x 2 1 + 2ix 1x
More informationLecture 13: Duality Uses and Correspondences
10725/36725: Conve Optimization Fall 2016 Lectre 13: Dality Uses and Correspondences Lectrer: Ryan Tibshirani Scribes: Yichong X, Yany Liang, Yanning Li Note: LaTeX template cortesy of UC Berkeley EECS
More informationSupport Vector Machines for Classification and Regression. 1 Linearly Separable Data: Hard Margin SVMs
E0 270 Machine Learning Lecture 5 (Jan 22, 203) Support Vector Machines for Classification and Regression Lecturer: Shivani Agarwal Disclaimer: These notes are a brief summary of the topics covered in
More informationSolving Dual Problems
Lecture 20 Solving Dual Problems We consider a constrained problem where, in addition to the constraint set X, there are also inequality and linear equality constraints. Specifically the minimization problem
More informationCONVEX FUNCTIONS AND OPTIMIZATION TECHINIQUES A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF
CONVEX FUNCTIONS AND OPTIMIZATION TECHINIQUES A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE IN MATHEMATICS SUBMITTED TO NATIONAL INSTITUTE OF TECHNOLOGY,
More informationThe fundamental theorem of linear programming
The fundamental theorem of linear programming Michael Tehranchi June 8, 2017 This note supplements the lecture notes of Optimisation The statement of the fundamental theorem of linear programming and the
More informationPrimalDual InteriorPoint Methods for Linear Programming based on Newton s Method
PrimalDual InteriorPoint Methods for Linear Programming based on Newton s Method Robert M. Freund March, 2004 2004 Massachusetts Institute of Technology. The Problem The logarithmic barrier approach
More informationMathematical Foundations 1 Constrained Optimization. Constrained Optimization. An intuitive approach 2. First Order Conditions (FOC) 7
Mathematical Foundations  Constrained Optimization Constrained Optimization An intuitive approach First Order Conditions (FOC) 7 Constraint qualifications 9 Formal statement of the FOC for a maximum
More informationLP Duality: outline. Duality theory for Linear Programming. alternatives. optimization I Idea: polyhedra
LP Duality: outline I Motivation and definition of a dual LP I Weak duality I Separating hyperplane theorem and theorems of the alternatives I Strong duality and complementary slackness I Using duality
More informationLecture 15 Newton Method and SelfConcordance. October 23, 2008
Newton Method and SelfConcordance October 23, 2008 Outline Lecture 15 Selfconcordance Notion Selfconcordant Functions Operations Preserving Selfconcordance Properties of Selfconcordant Functions Implications
More informationThe lasso. Patrick Breheny. February 15. The lasso Convex optimization Soft thresholding
Patrick Breheny February 15 Patrick Breheny HighDimensional Data Analysis (BIOS 7600) 1/24 Introduction Last week, we introduced penalized regression and discussed ridge regression, in which the penalty
More informationCONVEX OPTIMIZATION, DUALITY, AND THEIR APPLICATION TO SUPPORT VECTOR MACHINES. Contents 1. Introduction 1 2. Convex Sets
CONVEX OPTIMIZATION, DUALITY, AND THEIR APPLICATION TO SUPPORT VECTOR MACHINES DANIEL HENDRYCKS Abstract. This paper develops the fundamentals of convex optimization and applies them to Support Vector
More informationSupport Vector Machines
CS229 Lecture notes Andrew Ng Part V Support Vector Machines This set of notes presents the Support Vector Machine (SVM) learning algorithm. SVMs are among the best (and many believe is indeed the best)
More informationSymmetric and Asymmetric Duality
journal of mathematical analysis and applications 220, 125 131 (1998) article no. AY975824 Symmetric and Asymmetric Duality Massimo Pappalardo Department of Mathematics, Via Buonarroti 2, 56127, Pisa,
More informationYou should be able to...
Lecture Outline Gradient Projection Algorithm Constant Step Length, Varying Step Length, Diminishing Step Length Complexity Issues Gradient Projection With Exploration Projection Solving QPs: active set
More informationSTATIC LECTURE 4: CONSTRAINED OPTIMIZATION II  KUHN TUCKER THEORY
STATIC LECTURE 4: CONSTRAINED OPTIMIZATION II  KUHN TUCKER THEORY UNIVERSITY OF MARYLAND: ECON 600 1. Some Eamples 1 A general problem that arises countless times in economics takes the form: (Verbally):
More informationLAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION
LAGRANGIAN TRANSFORMATION IN CONVEX OPTIMIZATION ROMAN A. POLYAK Abstract. We introduce the Lagrangian Transformation(LT) and develop a general LT method for convex optimization problems. A class Ψ of
More informationDuality of LPs and Applications
Lecture 6 Duality of LPs and Applications Last lecture we introduced duality of linear programs. We saw how to form duals, and proved both the weak and strong duality theorems. In this lecture we will
More informationGaps in Support Vector Optimization
Gaps in Support Vector Optimization Nikolas List 1 (student author), Don Hush 2, Clint Scovel 2, Ingo Steinwart 2 1 Lehrstuhl Mathematik und Informatik, RuhrUniversity Bochum, Germany nlist@lmi.rub.de
More informationA Tutorial on Convex Optimization II: Duality and Interior Point Methods
A Tutorial on Convex Optimization II: Duality and Interior Point Methods Haitham Hindi Palo Alto Research Center (PARC), Palo Alto, California 94304 email: hhindi@parc.com Abstract In recent years, convex
More informationOptimization Tutorial 1. Basic Gradient Descent
E0 270 Machine Learning Jan 16, 2015 Optimization Tutorial 1 Basic Gradient Descent Lecture by Harikrishna Narasimhan Note: This tutorial shall assume background in elementary calculus and linear algebra.
More informationLecture 1: Introduction. Outline. B9824 Foundations of Optimization. Fall Administrative matters. 2. Introduction. 3. Existence of optima
B9824 Foundations of Optimization Lecture 1: Introduction Fall 2009 Copyright 2009 Ciamac Moallemi Outline 1. Administrative matters 2. Introduction 3. Existence of optima 4. Local theory of unconstrained
More informationLecture 2: Convex Sets and Functions
Lecture 2: Convex Sets and Functions HyangWon Lee Dept. of Internet & Multimedia Eng. Konkuk University Lecture 2 Network Optimization, Fall 2015 1 / 22 Optimization Problems Optimization problems are
More informationConstrained optimization: direct methods (cont.)
Constrained optimization: direct methods (cont.) Jussi Hakanen Postdoctoral researcher jussi.hakanen@jyu.fi Direct methods Also known as methods of feasible directions Idea in a point x h, generate a
More informationSolution Methods. Richard Lusby. Department of Management Engineering Technical University of Denmark
Solution Methods Richard Lusby Department of Management Engineering Technical University of Denmark Lecture Overview (jg Unconstrained Several Variables Quadratic Programming Separable Programming SUMT
More informationNonlinear Programming 3rd Edition. Theoretical Solutions Manual Chapter 6
Nonlinear Programming 3rd Edition Theoretical Solutions Manual Chapter 6 Dimitri P. Bertsekas Massachusetts Institute of Technology Athena Scientific, Belmont, Massachusetts 1 NOTE This manual contains
More informationIE 5531 Midterm #2 Solutions
IE 5531 Midterm #2 s Prof. John Gunnar Carlsson November 9, 2011 Before you begin: This exam has 9 pages and a total of 5 problems. Make sure that all pages are present. To obtain credit for a problem,
More informationModule 04 Optimization Problems KKT Conditions & Solvers
Module 04 Optimization Problems KKT Conditions & Solvers Ahmad F. Taha EE 5243: Introduction to CyberPhysical Systems Email: ahmad.taha@utsa.edu Webpage: http://engineering.utsa.edu/ taha/index.html September
More informationAdvanced Mathematical Programming IE417. Lecture 24. Dr. Ted Ralphs
Advanced Mathematical Programming IE417 Lecture 24 Dr. Ted Ralphs IE417 Lecture 24 1 Reading for This Lecture Sections 11.211.2 IE417 Lecture 24 2 The Linear Complementarity Problem Given M R p p and
More informationLecture 13: Constrained optimization
20101203 Basic ideas A nonlinearly constrained problem must somehow be converted relaxed into a problem which we can solve (a linear/quadratic or unconstrained problem) We solve a sequence of such problems
More informationLecture 17: Primaldual interiorpoint methods part II
10725/36725: Convex Optimization Spring 2015 Lecture 17: Primaldual interiorpoint methods part II Lecturer: Javier Pena Scribes: Pinchao Zhang, Wei Ma Note: LaTeX template courtesy of UC Berkeley EECS
More informationLECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE
LECTURE 25: REVIEW/EPILOGUE LECTURE OUTLINE CONVEX ANALYSIS AND DUALITY Basic concepts of convex analysis Basic concepts of convex optimization Geometric duality framework  MC/MC Constrained optimization
More informationNumerical optimization
Numerical optimization Lecture 4 Alexander & Michael Bronstein tosca.cs.technion.ac.il/book Numerical geometry of nonrigid shapes Stanford University, Winter 2009 2 Longest Slowest Shortest Minimal Maximal
More informationELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications
ELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications Professor M. Chiang Electrical Engineering Department, Princeton University March
More informationLecture 1: Introduction. Outline. B9824 Foundations of Optimization. Fall Administrative matters. 2. Introduction. 3. Existence of optima
B9824 Foundations of Optimization Lecture 1: Introduction Fall 2010 Copyright 2010 Ciamac Moallemi Outline 1. Administrative matters 2. Introduction 3. Existence of optima 4. Local theory of unconstrained
More informationFinite Dimensional Optimization Part I: The KKT Theorem 1
John Nachbar Washington University March 26, 2018 1 Introduction Finite Dimensional Optimization Part I: The KKT Theorem 1 These notes characterize maxima and minima in terms of first derivatives. I focus
More informationSoftmargin SVM can address linearly separable problems with outliers
Nonlinear Support Vector Machines Nonlinearly separable problems Hardmargin SVM can address linearly separable problems Softmargin SVM can address linearly separable problems with outliers Nonlinearly
More informationLecture 10: A brief introduction to Support Vector Machine
Lecture 10: A brief introduction to Support Vector Machine Advanced Applied Multivariate Analysis STAT 2221, Fall 2013 Sungkyu Jung Department of Statistics, University of Pittsburgh Xingye Qiao Department
More informationOptimization methods
Lecture notes 3 February 8, 016 1 Introduction Optimization methods In these notes we provide an overview of a selection of optimization methods. We focus on methods which rely on firstorder information,
More informationThe KuhnTucker Problem
Natalia Lazzati Mathematics for Economics (Part I) Note 8: Nonlinear Programming  The KuhnTucker Problem Note 8 is based on de la Fuente (2000, Ch. 7) and Simon and Blume (1994, Ch. 18 and 19). The KuhnTucker
More informationNonlinear Programming (NLP)
Natalia Lazzati Mathematics for Economics (Part I) Note 6: Nonlinear Programming  Unconstrained Optimization Note 6 is based on de la Fuente (2000, Ch. 7), Madden (1986, Ch. 3 and 5) and Simon and Blume
More informationReview of Optimization Basics
Review of Optimization Basics. Introduction Electricity markets throughout the US are said to have a twosettlement structure. The reason for this is that the structure includes two different markets:
More informationJørgen Tind, Department of Statistics and Operations Research, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen O, Denmark.
DUALITY THEORY Jørgen Tind, Department of Statistics and Operations Research, University of Copenhagen, Universitetsparken 5, 2100 Copenhagen O, Denmark. Keywords: Duality, Saddle point, Complementary
More informationELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications
ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications Professor M. Chiang Electrical Engineering Department, Princeton University February
More informationSUPPORT VECTOR MACHINE FOR THE SIMULTANEOUS APPROXIMATION OF A FUNCTION AND ITS DERIVATIVE
SUPPORT VECTOR MACHINE FOR THE SIMULTANEOUS APPROXIMATION OF A FUNCTION AND ITS DERIVATIVE M. Lázaro 1, I. Santamaría 2, F. PérezCruz 1, A. ArtésRodríguez 1 1 Departamento de Teoría de la Señal y Comunicaciones
More information8. Geometric problems
8. Geometric problems Convex Optimization Boyd & Vandenberghe extremal volume ellipsoids centering classification placement and facility location 8 1 Minimum volume ellipsoid around a set LöwnerJohn ellipsoid
More informationSMO Algorithms for Support Vector Machines without Bias Term
Institute of Automatic Control Laboratory for Control Systems and Process Automation Prof. Dr.Ing. Dr. h. c. Rolf Isermann SMO Algorithms for Support Vector Machines without Bias Term Michael Vogt, 18Jul2002
More informationA Review of Linear Programming
A Review of Linear Programming Instructor: Farid Alizadeh IEOR 4600y Spring 2001 February 14, 2001 1 Overview In this note we review the basic properties of linear programming including the primal simplex
More informationRankone LMIs and Lyapunov's Inequality. Gjerrit Meinsma 4. Abstract. We describe a new proof of the wellknown Lyapunov's matrix inequality about
Rankone LMIs and Lyapunov's Inequality Didier Henrion 1;; Gjerrit Meinsma Abstract We describe a new proof of the wellknown Lyapunov's matrix inequality about the location of the eigenvalues of a matrix
More informationAgenda. Interior Point Methods. 1 Barrier functions. 2 Analytic center. 3 Central path. 4 Barrier method. 5 Primaldual path following algorithms
Agenda Interior Point Methods 1 Barrier functions 2 Analytic center 3 Central path 4 Barrier method 5 Primaldual path following algorithms 6 Nesterov Todd scaling 7 Complexity analysis Interior point
More informationA Polyhedral Cone Counterexample
Division of the Humanities and Social Sciences A Polyhedral Cone Counterexample KCB 9/20/2003 Revised 12/12/2003 Abstract This is an example of a pointed generating convex cone in R 4 with 5 extreme rays,
More information15. Conic optimization
L. Vandenberghe EE236C (Spring 216) 15. Conic optimization conic linear program examples modeling duality 151 Generalized (conic) inequalities Conic inequality: a constraint x K where K is a convex cone
More information1 Convexity, concavity and quasiconcavity. (SB )
UNIVERSITY OF MARYLAND ECON 600 Summer 2010 Lecture Two: Unconstrained Optimization 1 Convexity, concavity and quasiconcavity. (SB 21.121.3.) For any two points, x, y R n, we can trace out the line of
More informationORIE 6300 Mathematical Programming I August 25, Lecture 2
ORIE 6300 Mathematical Programming I August 25, 2016 Lecturer: Damek Davis Lecture 2 Scribe: Johan Bjorck Last time, we considered the dual of linear programs in our basic form: max(c T x : Ax b). We also
More informationHW1 solutions. 1. α Ef(x) β, where Ef(x) is the expected value of f(x), i.e., Ef(x) = n. i=1 p if(a i ). (The function f : R R is given.
HW1 solutions Exercise 1 (Some sets of probability distributions.) Let x be a realvalued random variable with Prob(x = a i ) = p i, i = 1,..., n, where a 1 < a 2 < < a n. Of course p R n lies in the standard
More informationSemidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5
Semidefinite and Second Order Cone Programming Seminar Fall 2001 Lecture 5 Instructor: Farid Alizadeh Scribe: Anton Riabov 10/08/2001 1 Overview We continue studying the maximum eigenvalue SDP, and generalize
More informationDual and primaldual methods
ELE 538B: LargeScale Optimization for Data Science Dual and primaldual methods Yuxin Chen Princeton University, Spring 2018 Outline Dual proximal gradient method Primaldual proximal gradient method
More informationUniqueness of the SVM Solution
Uniqueness of the SVM Solution Christopher J.C. Burges Advanced Technologies, Bell Laboratories, Lucent Technologies Holmdel, New Jersey burges@iucent.com David J. Crisp Centre for Sensor Signal and Information
More informationLecture: Cone programming. Approximating the Lorentz cone.
Strong relaxations for discrete optimization problems 10/05/16 Lecture: Cone programming. Approximating the Lorentz cone. Lecturer: Yuri Faenza Scribes: Igor Malinović 1 Introduction Cone programming is
More informationMingbin Feng, John E. Mitchell, JongShi Pang, Xin Shen, Andreas Wächter
Complementarity Formulations of l 0 norm Optimization Problems 1 Mingbin Feng, John E. Mitchell, JongShi Pang, Xin Shen, Andreas Wächter Abstract: In a number of application areas, it is desirable to
More information