KarushKuhnTucker Conditions. Lecturer: Ryan Tibshirani Convex Optimization /36725


 Benedict Dixon
 1 years ago
 Views:
Transcription
1 KarushKuhnTucker Conditions Lecturer: Ryan Tibshirani Convex Optimization /
2 Given a minimization problem Last time: duality min x subject to f(x) h i (x) 0, i = 1,... m l j (x) = 0, j = 1,... r we defined the Lagrangian: L(x, u, v) = f(x) + and Lagrange dual function: m u i h i (x) + g(u, v) = min x L(x, u, v) r v j l j (x) j=1 2
3 The subsequent dual problem is: Important properties: max u,v g(u, v) subject to u 0 Dual problem is always convex, i.e., g is always concave (even if primal problem is not convex) The primal and dual optimal values, f and g, always satisfy weak duality: f g Slater s condition: for convex primal, if there is an x such that h 1 (x) < 0,... h m (x) < 0 and l 1 (x) = 0,... l r (x) = 0 then strong duality holds: f = g. Can be further refined to strict inequalities over the nonaffine h i, i = 1,... m 3
4 Outline Today: KKT conditions Examples Constrained and Lagrange forms Uniqueness with l 1 penalties 4
5 Given general problem KarushKuhnTucker conditions min x subject to f(x) h i (x) 0, i = 1,... m l j (x) = 0, j = 1,... r The KarushKuhnTucker conditions or KKT conditions are: ( m r ) 0 f(x) + u i h i (x) + v j l j (x) (stationarity) u i h i (x) = 0 for all i j=1 h i (x) 0, l j (x) = 0 for all i, j u i 0 for all i (complementary slackness) (primal feasibility) (dual feasibility) 5
6 Necessity Let x and u, v be primal and dual solutions with zero duality gap (strong duality holds, e.g., under Slater s condition). Then f(x ) = g(u, v ) = min x f(x) + f(x ) + f(x ) m u i h i (x) + m u i h i (x ) + r vj l j (x) j=1 r vj l j (x ) j=1 In other words, all these inequalities are actually equalities 6
7 Two things to learn from this: The point x minimizes L(x, u, v ) over x R n. Hence the subdifferential of L(x, u, v ) must contain 0 at x = x this is exactly the stationarity condition We must have m u i h i(x ) = 0, and since each term here is 0, this implies u i h i(x ) = 0 for every i this is exactly complementary slackness Primal and dual feasibility hold by virtue of optimality. Therefore: If x and u, v are primal and dual solutions, with zero duality gap, then x, u, v satisfy the KKT conditions (Note that this statement assumes nothing a priori about convexity of our problem, i.e., of f, h i, l j ) 7
8 Sufficiency If there exists x, u, v that satisfy the KKT conditions, then g(u, v ) = f(x ) + = f(x ) m u i h i (x ) + r vj l j (x ) j=1 where the first equality holds from stationarity, and the second holds from complementary slackness Therefore the duality gap is zero (and x and u, v are primal and dual feasible) so x and u, v are primal and dual optimal. Hence, we ve shown: If x and u, v satisfy the KKT conditions, then x and u, v are primal and dual solutions 8
9 Putting it together In summary, KKT conditions: always sufficient necessary under strong duality Putting it together: For a problem with strong duality (e.g., assume Slater s condition: convex problem and there exists x strictly satisfying nonaffine inequality contraints), x and u, v are primal and dual solutions x and u, v satisfy the KKT conditions (Warning, concerning the stationarity condition: for a differentiable function f, we cannot use f(x) = { f(x)} unless f is convex!) 9
10 10 What s in a name? Older folks will know these as the KT (KuhnTucker) conditions: First appeared in publication by Kuhn and Tucker in 1951 Later people found out that Karush had the conditions in his unpublished master s thesis of 1939 For unconstrained problems, the KKT conditions are nothing more than the subgradient optimality condition For general convex problems, the KKT conditions could have been derived entirely from studying optimality via subgradients 0 f(x ) + m N {hi 0}(x ) + r N {lj =0}(x ) j=1 where recall N C (x) is the normal cone of C at x
11 11 Example: quadratic with equality constraints Consider for Q 0, min 2 xt Qx + c T x subject to Ax = 0 x 1 E.g., as we will see, this corresponds to Newton step for equalityconstrained problem min x f(x) subject to Ax = b Convex problem, no inequality constraints, so by KKT conditions: x is a solution if and only if [ ] ] Q A T A 0 ] [ x u = [ c 0 for some u. Linear system combines stationarity, primal feasibility (complementary slackness and dual feasibility are vacuous)
12 12 Example: waterfilling Example from B & V page 245: consider problem min x n log(α i + x i ) subject to x 0, 1 T x = 1 Information theory: think of log(α i + x i ) as communication rate of ith channel. KKT conditions: 1/(α i + x i ) u i + v = 0, i = 1,... n Eliminate u: u i x i = 0, i = 1,... n, x 0, 1 T x = 1, u 0 1/(α i + x i ) v, i = 1,... n x i (v 1/(α i + x i )) = 0, i = 1,... n, x 0, 1 T x = 1
13 13 Can argue directly stationarity and complementary slackness imply { 1/v α i if v < 1/α i x i = = max{0, 1/v α i }, i = 1,... n 0 if v 1/α i Still need x to be feasible, i.e., 1 T x = 1, and this gives n max{0, 1/v α i } = 1 Univariate equation, piecewise linear in 1/v and not hard to solve This reduced problem is called waterfilling (From B & V page 246) 1/ν x i α i i
14 14 Example: support vector machines Given y { 1, 1} n, and X R n p, the support vector machine problem is: min β,β 0,ξ subject to 1 2 β C n ξ i ξ i 0, i = 1,... n y i (x T i β + β 0 ) 1 ξ i, i = 1,... n Introduce dual variables v, w 0. KKT stationarity condition: n 0 = w i y i, β = Complementary slackness: n w i y i x i, w = C1 v v i ξ i = 0, w i ( 1 ξi y i (x T i β + β 0 ) ) = 0, i = 1,... n
15 15 Hence at optimality we have β = n w iy i x i, and w i is nonzero only if y i (x T i β + β 0) = 1 ξ i. Such points i are called the support points For support point i, if ξ i = 0, then x i lies on edge of margin, and w i (0, C]; For support point i, if ξ i 0, then x i lies on wrong side of margin, and w i = C x T β + β 0 =0 ξ 4 ξ5 ξ ξ 3 1 ξ 2 M = 1 β M = 1 β margin KKT conditions do not really give us a way to find solution, but gives a better understanding In fact, we can use this to screen away nonsupport points before performing optimization
16 16 Constrained and Lagrange forms Often in statistics and machine learning we ll switch back and forth between constrained form, where t R is a tuning parameter, min x f(x) subject to h(x) t (C) and Lagrange form, where λ 0 is a tuning parameter, min x f(x) + λ h(x) (L) and claim these are equivalent. Is this true (assuming convex f, h)? (C) to (L): if problem (C) is strictly feasible, then strong duality holds, and there exists some λ 0 (dual solution) such that any solution x in (C) minimizes so x is also a solution in (L) f(x) + λ (h(x) t)
17 17 (L) to (C): if x is a solution in (L), then the KKT conditions for (C) are satisfied by taking t = h(x ), so x is a solution in (C) Conclusion: {solutions in (L)} λ 0 λ 0 {solutions in (L)} {solutions in (C)} t {solutions in (C)} t such that (C) is strictly feasible This is nearly a perfect equivalence. Note: when the only value of t that leads to a feasible but not strictly feasible constraint set is t = 0, then we do get perfect equivalence So, e.g., if h 0, and (C), (L) are feasible for all t, λ 0, then we do get perfect equivalence
18 Uniqueness in l 1 penalized problems Using the KKT conditions and simple probability arguments, we have the following (perhaps surprising) result: Theorem: Let f be differentiable and strictly convex, let X R n p, λ > 0. Consider min β f(xβ) + λ β 1 If the entries of X are drawn from a continuous probability distribution (on R np ), then w.p. 1 there is a unique solution and it has at most min{n, p} nonzero components Remark: here f must be strictly convex, but no restrictions on the dimensions of X (we could have p n) Proof: the KKT conditions are { X T {sign(β i )} if β i 0 f(xβ) = λs, s i [ 1, 1] if β i = 0, i = 1,... n 18
19 19 Note that Xβ, s are unique. Define S = {j : Xj T f(xβ) = λ}, also unique, and note that any solution satisfies β i = 0 for all i / S First assume that rank(x S ) < S (here X R n S, submatrix of X corresponding to columns in S). Then for some i S, X i = for constants c j R, so that s i X i = j S\{i} j S\{i} c j X j s j c j λ(s j X j ) Hence taking an inner product with f(xβ), λ = j S\{i} (s i s j c j )λ, i.e., j S\{i} s i s j c j = 1
20 20 In other words, we ve proved that rank(x S ) < S implies s i X i is in the affine span of s j X j, j S \ {i} (subspace of dimension < n) We say that the matrix X has columns in general position if any affine subspace L of dimension k < n does not contain more than k + 1 elements; of {±X 1,... ± X p } (excluding antipodal pairs) It is straightforward to show that, if the entries of X have a density over R np, then X is in general position with probability 1 X 3 X 4 X 2 X 1
21 21 Therefore, if entries of X are drawn from continuous probability distribution, any solution must satisfy rank(x S ) = S Recalling the KKT conditions, this means the number of nonzero components in any solution at most S min{n, p}. Further, we can reduce our optimization problem (by partially solving) to min β S R S f(x S β S ) + λ β S 1 Finally, strict convexity implies uniqueness of the solution in this problem, and hence in our original problem
22 22 Back to duality One of the most important uses of duality is that, under strong duality, we can characterize primal solutions from dual solutions Recall that under strong duality, the KKT conditions are necessary for optimality. Given dual solutions u, v, any primal solution x satisfies the stationarity condition 0 f(x ) + m u i h i (x ) + In other words, x solves min x L(x, u, v ) r vi l j (x ) j=1 Generally, this reveals a characterization of primal solutions In particular, if this is satisfied uniquely (i.e., above problem has a unique minimizer), then the corresponding point must be the primal solution
23 23 References S. Boyd and L. Vandenberghe (2004), Convex optimization, Chapter 5 R. T. Rockafellar (1970), Convex analysis, Chapters 28 30
Duality Uses and Correspondences. Ryan Tibshirani Convex Optimization
Duality Uses and Correspondences Ryan Tibshirani Conve Optimization 10725 Recall that for the problem Last time: KKT conditions subject to f() h i () 0, i = 1,... m l j () = 0, j = 1,... r the KKT conditions
More informationConvex Optimization & Lagrange Duality
Convex Optimization & Lagrange Duality Chee Wei Tan CS 8292 : Advanced Topics in Convex Optimization and its Applications Fall 2010 Outline Convex optimization Optimality condition Lagrange duality KKT
More informationCOM S 672: Advanced Topics in Computational Models of Learning Optimization for Learning
COM S 672: Advanced Topics in Computational Models of Learning Optimization for Learning Lecture Note 4: Optimality Conditions Jia (Kevin) Liu Assistant Professor Department of Computer Science Iowa State
More informationCSE4830 Kernel Methods in Machine Learning
CSE4830 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 27. September, 2017 Juho Rousu 27. September, 2017 1 / 45 Convex optimization Convex optimisation This
More informationICSE4030 Kernel Methods in Machine Learning
ICSE4030 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 28. September, 2016 Juho Rousu 28. September, 2016 1 / 38 Convex optimization Convex optimisation This
More informationConvex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014
Convex Optimization Dani Yogatama School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA February 12, 2014 Dani Yogatama (Carnegie Mellon University) Convex Optimization February 12,
More informationExtreme Abridgment of Boyd and Vandenberghe s Convex Optimization
Extreme Abridgment of Boyd and Vandenberghe s Convex Optimization Compiled by David Rosenberg Abstract Boyd and Vandenberghe s Convex Optimization book is very wellwritten and a pleasure to read. The
More informationSupport vector machines
Support vector machines Guillaume Obozinski Ecole des Ponts  ParisTech SOCN course 2014 SVM, kernel methods and multiclass 1/23 Outline 1 Constrained optimization, Lagrangian duality and KKT 2 Support
More informationConstrained Optimization and Lagrangian Duality
CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may
More informationI.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec  Spring 2010
I.3. LMI DUALITY Didier HENRION henrion@laas.fr EECI Graduate School on Control Supélec  Spring 2010 Primal and dual For primal problem p = inf x g 0 (x) s.t. g i (x) 0 define Lagrangian L(x, z) = g 0
More informationLecture 18: Optimization Programming
Fall, 2016 Outline Unconstrained Optimization 1 Unconstrained Optimization 2 Equalityconstrained Optimization Inequalityconstrained Optimization Mixtureconstrained Optimization 3 Quadratic Programming
More informationIntroduction to Machine Learning Lecture 7. Mehryar Mohri Courant Institute and Google Research
Introduction to Machine Learning Lecture 7 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Convex Optimization Differentiation Definition: let f : X R N R be a differentiable function,
More informationDuality. Geoff Gordon & Ryan Tibshirani Optimization /
Duality Geoff Gordon & Ryan Tibshirani Optimization 10725 / 36725 1 Duality in linear programs Suppose we want to find lower bound on the optimal value in our convex problem, B min x C f(x) E.g., consider
More informationLecture: Duality of LP, SOCP and SDP
1/33 Lecture: Duality of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2017.html wenzw@pku.edu.cn Acknowledgement:
More informationLecture 6: Conic Optimization September 8
IE 598: Big Data Optimization Fall 2016 Lecture 6: Conic Optimization September 8 Lecturer: Niao He Scriber: Juan Xu Overview In this lecture, we finish up our previous discussion on optimality conditions
More informationThe Lagrangian L : R d R m R r R is an (easier to optimize) lower bound on the original problem:
HT05: SC4 Statistical Data Mining and Machine Learning Dino Sejdinovic Department of Statistics Oxford Convex Optimization and slides based on Arthur Gretton s Advanced Topics in Machine Learning course
More informationCOM S 578X: Optimization for Machine Learning
COM S 578X: Optimization for Machine Learning Lecture Note 5: Optimality Conditions Jia (Kevin) Liu Assistant Professor Department of Computer Science Iowa State University Ames Iowa USA Fall 2018 JKL
More informationSupport Vector Machines
Support Vector Machines Support vector machines (SVMs) are one of the central concepts in all of machine learning. They are simply a combination of two ideas: linear classification via maximum (or optimal
More information5. Duality. Lagrangian
5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized
More informationDuality in Linear Programs. Lecturer: Ryan Tibshirani Convex Optimization /36725
Duality in Linear Programs Lecturer: Ryan Tibshirani Convex Optimization 10725/36725 1 Last time: proximal gradient descent Consider the problem x g(x) + h(x) with g, h convex, g differentiable, and
More informationOptimization for Machine Learning
Optimization for Machine Learning (Problems; Algorithms  A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html
More information14. Duality. ˆ Upper and lower bounds. ˆ General duality. ˆ Constraint qualifications. ˆ Counterexample. ˆ Complementary slackness.
CS/ECE/ISyE 524 Introduction to Optimization Spring 2016 17 14. Duality ˆ Upper and lower bounds ˆ General duality ˆ Constraint qualifications ˆ Counterexample ˆ Complementary slackness ˆ Examples ˆ Sensitivity
More informationLagrange Duality. Daniel P. Palomar. Hong Kong University of Science and Technology (HKUST)
Lagrange Duality Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470  Convex Optimization Fall 201718, HKUST, Hong Kong Outline of Lecture Lagrangian Dual function Dual
More informationShiqian Ma, MAT258A: Numerical Optimization 1. Chapter 4. Subgradient
Shiqian Ma, MAT258A: Numerical Optimization 1 Chapter 4 Subgradient Shiqian Ma, MAT258A: Numerical Optimization 2 4.1. Subgradients definition subgradient calculus duality and optimality conditions Shiqian
More informationConvex Optimization Boyd & Vandenberghe. 5. Duality
5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized
More informationConvex Optimization M2
Convex Optimization M2 Lecture 3 A. d Aspremont. Convex Optimization M2. 1/49 Duality A. d Aspremont. Convex Optimization M2. 2/49 DMs DM par email: dm.daspremont@gmail.com A. d Aspremont. Convex Optimization
More informationDuality. Lagrange dual problem weak and strong duality optimality conditions perturbation and sensitivity analysis generalized inequalities
Duality Lagrange dual problem weak and strong duality optimality conditions perturbation and sensitivity analysis generalized inequalities Lagrangian Consider the optimization problem in standard form
More informationOptimality Conditions for Constrained Optimization
72 CHAPTER 7 Optimality Conditions for Constrained Optimization 1. First Order Conditions In this section we consider first order optimality conditions for the constrained problem P : minimize f 0 (x)
More informationIntroduction to Machine Learning Spring 2018 Note Duality. 1.1 Primal and Dual Problem
CS 189 Introduction to Machine Learning Spring 2018 Note 22 1 Duality As we have seen in our discussion of kernels, ridge regression can be viewed in two ways: (1) an optimization problem over the weights
More informationConstrained Optimization
1 / 22 Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University March 30, 2015 2 / 22 1. Equality constraints only 1.1 Reduced gradient 1.2 Lagrange
More informationOn the Method of Lagrange Multipliers
On the Method of Lagrange Multipliers Reza Nasiri Mahalati November 6, 2016 Most of what is in this note is taken from the Convex Optimization book by Stephen Boyd and Lieven Vandenberghe. This should
More informationThe KarushKuhnTucker (KKT) conditions
The KarushKuhnTucker (KKT) conditions In this section, we will give a set of sufficient (and at most times necessary) conditions for a x to be the solution of a given convex optimization problem. These
More informationSupport Vector Machines: Maximum Margin Classifiers
Support Vector Machines: Maximum Margin Classifiers Machine Learning and Pattern Recognition: September 16, 2008 Piotr Mirowski Based on slides by Sumit Chopra and FuJie Huang 1 Outline What is behind
More informationLecture 7: Convex Optimizations
Lecture 7: Convex Optimizations Radu Balan, David Levermore March 29, 2018 Convex Sets. Convex Functions A set S R n is called a convex set if for any points x, y S the line segment [x, y] := {tx + (1
More informationLagrangian Duality and Convex Optimization
Lagrangian Duality and Convex Optimization David Rosenberg New York University February 11, 2015 David Rosenberg (New York University) DSGA 1003 February 11, 2015 1 / 24 Introduction Why Convex Optimization?
More informationLecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem
Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 00 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R
More informationMotivation. Lecture 2 Topics from Optimization and Duality. network utility maximization (NUM) problem:
CDS270 Maryam Fazel Lecture 2 Topics from Optimization and Duality Motivation network utility maximization (NUM) problem: consider a network with S sources (users), each sending one flow at rate x s, through
More informationConvex Optimization Overview (cnt d)
Conve Optimization Overview (cnt d) Chuong B. Do November 29, 2009 During last week s section, we began our study of conve optimization, the study of mathematical optimization problems of the form, minimize
More informationNonlinear Programming and the KuhnTucker Conditions
Nonlinear Programming and the KuhnTucker Conditions The KuhnTucker (KT) conditions are firstorder conditions for constrained optimization problems, a generalization of the firstorder conditions we
More informationISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints
ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints Instructor: Prof. Kevin Ross Scribe: Nitish John October 18, 2011 1 The Basic Goal The main idea is to transform a given constrained
More informationLecture: Duality.
Lecture: Duality http://bicmr.pku.edu.cn/~wenzw/opt2016fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghe s lecture notes Introduction 2/35 Lagrange dual problem weak and strong
More informationMachine Learning. Support Vector Machines. Manfred Huber
Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data
More informationCourse Notes for EE227C (Spring 2018): Convex Optimization and Approximation
Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu
More informationHomework 3. Convex Optimization /36725
Homework 3 Convex Optimization 10725/36725 Due Friday October 14 at 5:30pm submitted to Christoph Dann in Gates 8013 (Remember to a submit separate writeup for each problem, with your name at the top)
More informationUses of duality. Geoff Gordon & Ryan Tibshirani Optimization /
Uses of duality Geoff Gordon & Ryan Tibshirani Optimization 10725 / 36725 1 Remember conjugate functions Given f : R n R, the function is called its conjugate f (y) = max x R n yt x f(x) Conjugates appear
More informationLecture Notes on Support Vector Machine
Lecture Notes on Support Vector Machine Feng Li fli@sdu.edu.cn Shandong University, China 1 Hyperplane and Margin In a ndimensional space, a hyper plane is defined by ω T x + b = 0 (1) where ω R n is
More informationPrimal/Dual Decomposition Methods
Primal/Dual Decomposition Methods Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470  Convex Optimization Fall 201819, HKUST, Hong Kong Outline of Lecture Subgradients
More informationSubgradient. Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes. definition. subgradient calculus
1/41 Subgradient Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes definition subgradient calculus duality and optimality conditions directional derivative Basic inequality
More informationOptimality, Duality, Complementarity for Constrained Optimization
Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of WisconsinMadison May 2014 Wright (UWMadison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear
More information4TE3/6TE3. Algorithms for. Continuous Optimization
4TE3/6TE3 Algorithms for Continuous Optimization (Duality in Nonlinear Optimization ) Tamás TERLAKY Computing and Software McMaster University Hamilton, January 2004 terlaky@mcmaster.ca Tel: 27780 Optimality
More informationConvex Optimization and Modeling
Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual
More informationSupport Vector Machines, Kernel SVM
Support Vector Machines, Kernel SVM Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 27, 2017 1 / 40 Outline 1 Administration 2 Review of last lecture 3 SVM
More informationE5295/5B5749 Convex optimization with engineering applications. Lecture 5. Convex programming and semidefinite programming
E5295/5B5749 Convex optimization with engineering applications Lecture 5 Convex programming and semidefinite programming A. Forsgren, KTH 1 Lecture 5 Convex optimization 2006/2007 Convex quadratic program
More information1. f(β) 0 (that is, β is a feasible point for the constraints)
xvi 2. The lasso for linear models 2.10 Bibliographic notes Appendix Convex optimization with constraints In this Appendix we present an overview of convex optimization concepts that are particularly useful
More informationLagrange duality. The Lagrangian. We consider an optimization program of the form
Lagrange duality Another way to arrive at the KKT conditions, and one which gives us some insight on solving constrained optimization problems, is through the Lagrange dual. The dual is a maximization
More informationEE/AA 578, Univ of Washington, Fall Duality
7. Duality EE/AA 578, Univ of Washington, Fall 2016 Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized
More informationLecture 3. Optimization Problems and Iterative Algorithms
Lecture 3 Optimization Problems and Iterative Algorithms January 13, 2016 This material was jointly developed with Angelia Nedić at UIUC for IE 598ns Outline Special Functions: Linear, Quadratic, Convex
More informationPattern Classification, and Quadratic Problems
Pattern Classification, and Quadratic Problems (Robert M. Freund) March 3, 24 c 24 Massachusetts Institute of Technology. 1 1 Overview Pattern Classification, Linear Classifiers, and Quadratic Optimization
More informationSupport Vector Machines
Support Vector Machines Sridhar Mahadevan mahadeva@cs.umass.edu University of Massachusetts Sridhar Mahadevan: CMPSCI 689 p. 1/32 Margin Classifiers margin b = 0 Sridhar Mahadevan: CMPSCI 689 p.
More informationSemidefinite Programming
Semidefinite Programming Notes by Bernd Sturmfels for the lecture on June 26, 208, in the IMPRS Ringvorlesung Introduction to Nonlinear Algebra The transition from linear algebra to nonlinear algebra has
More informationCoordinate descent. Geoff Gordon & Ryan Tibshirani Optimization /
Coordinate descent Geoff Gordon & Ryan Tibshirani Optimization 10725 / 36725 1 Adding to the toolbox, with stats and ML in mind We ve seen several general and useful minimization tools Firstorder methods
More informationDual methods and ADMM. Barnabas Poczos & Ryan Tibshirani Convex Optimization /36725
Dual methods and ADMM Barnabas Poczos & Ryan Tibshirani Convex Optimization 10725/36725 1 Given f : R n R, the function is called its conjugate Recall conjugate functions f (y) = max x R n yt x f(x)
More informationLinear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)
Support vector machines In a nutshell Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Solution only depends on a small subset of training
More informationConvex Optimization Lecture 6: KKT Conditions, and applications
Convex Optimization Lecture 6: KKT Conditions, and applications Dr. Michel Baes, IFOR / ETH Zürich Quick recall of last week s lecture Various aspects of convexity: The set of minimizers is convex. Convex
More informationsubject to (x 2)(x 4) u,
Exercises Basic definitions 5.1 A simple example. Consider the optimization problem with variable x R. minimize x 2 + 1 subject to (x 2)(x 4) 0, (a) Analysis of primal problem. Give the feasible set, the
More informationChap 2. Optimality conditions
Chap 2. Optimality conditions Version: 29092012 2.1 Optimality conditions in unconstrained optimization Recall the definitions of global, local minimizer. Geometry of minimization Consider for f C 1
More informationTutorial on Convex Optimization: Part II
Tutorial on Convex Optimization: Part II Dr. Khaled Ardah Communications Research Laboratory TU Ilmenau Dec. 18, 2018 Outline Convex Optimization Review Lagrangian Duality Applications Optimal Power Allocation
More informationConvex Optimization and Support Vector Machine
Convex Optimization and Support Vector Machine Problem 0. Consider a twoclass classification problem. The training data is L n = {(x 1, t 1 ),..., (x n, t n )}, where each t i { 1, 1} and x i R p. We
More informationConvex Optimization and SVM
Convex Optimization and SVM Problem 0. Cf lecture notes pages 12 to 18. Problem 1. (i) A slab is an intersection of two half spaces, hence convex. (ii) A wedge is an intersection of two half spaces, hence
More informationDuality Theory of Constrained Optimization
Duality Theory of Constrained Optimization Robert M. Freund April, 2014 c 2014 Massachusetts Institute of Technology. All rights reserved. 1 2 1 The Practical Importance of Duality Duality is pervasive
More informationIntroduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur
Introduction to Machine Learning Prof. Sudeshna Sarkar Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Module  5 Lecture  22 SVM: The Dual Formulation Good morning.
More informationCS711008Z Algorithm Design and Analysis
CS711008Z Algorithm Design and Analysis Lecture 8 Linear programming: interior point method Dongbo Bu Institute of Computing Technology Chinese Academy of Sciences, Beijing, China 1 / 31 Outline Brief
More information10701 Recitation 5 Duality and SVM. Ahmed Hefny
10701 Recitation 5 Duality and SVM Ahmed Hefny Outline Langrangian and Duality The Lagrangian Duality Eamples Support Vector Machines Primal Formulation Dual Formulation Soft Margin and Hinge Loss Lagrangian
More informationLecture 14: Optimality Conditions for Conic Problems
EE 227A: Conve Optimization and Applications March 6, 2012 Lecture 14: Optimality Conditions for Conic Problems Lecturer: Laurent El Ghaoui Reading assignment: 5.5 of BV. 14.1 Optimality for Conic Problems
More informationMachine Learning. Lecture 6: Support Vector Machine. Feng Li.
Machine Learning Lecture 6: Support Vector Machine Feng Li fli@sdu.edu.cn https://funglee.github.io School of Computer Science and Technology Shandong University Fall 2018 Warm Up 2 / 80 Warm Up (Contd.)
More informationCSC 411 Lecture 17: Support Vector Machine
CSC 411 Lecture 17: Support Vector Machine Ethan Fetaya, James Lucas and Emad Andrews University of Toronto CSC411 Lec17 1 / 1 Today Maxmargin classification SVM Hard SVM Duality Soft SVM CSC411 Lec17
More informationNumerical Optimization
Constrained Optimization Computer Science and Automation Indian Institute of Science Bangalore 560 012, India. NPTEL Course on Constrained Optimization Constrained Optimization Problem: min h j (x) 0,
More informationNewton s Method. Ryan Tibshirani Convex Optimization /36725
Newton s Method Ryan Tibshirani Convex Optimization 10725/36725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, Properties and examples: f (y) = max x
More informationSupport Vector Machines
EE 17/7AT: Optimization Models in Engineering Section 11/1  April 014 Support Vector Machines Lecturer: Arturo Fernandez Scribe: Arturo Fernandez 1 Support Vector Machines Revisited 1.1 Strictly) Separable
More informationLinear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)
Support vector machines In a nutshell Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Solution only depends on a small subset of training
More informationSubgradients. subgradients and quasigradients. subgradient calculus. optimality conditions via subgradients. directional derivatives
Subgradients subgradients and quasigradients subgradient calculus optimality conditions via subgradients directional derivatives Prof. S. Boyd, EE392o, Stanford University Basic inequality recall basic
More informationSparsity and the Lasso
Sparsity and the Lasso Statistical Machine Learning, Spring 205 Ryan Tibshirani (with Larry Wasserman Regularization and the lasso. A bit of background If l 2 was the norm of the 20th century, then l is
More informationLecture 10: Duality in Linear Programs
10725/36725: Convex Optimization Spring 2015 Lecture 10: Duality in Linear Programs Lecturer: Ryan Tibshirani Scribes: Jingkun Gao and Ying Zhang Disclaimer: These notes have not been subjected to the
More informationSupport Vector Machine (SVM) and Kernel Methods
Support Vector Machine (SVM) and Kernel Methods CE717: Machine Learning Sharif University of Technology Fall 2015 Soleymani Outline Margin concept HardMargin SVM SoftMargin SVM Dual Problems of HardMargin
More informationLagrange Relaxation and Duality
Lagrange Relaxation and Duality As we have already known, constrained optimization problems are harder to solve than unconstrained problems. By relaxation we can solve a more difficult problem by a simpler
More informationSeminars on Mathematics for Economics and Finance Topic 5: Optimization KuhnTucker conditions for problems with inequality constraints 1
Seminars on Mathematics for Economics and Finance Topic 5: Optimization KuhnTucker conditions for problems with inequality constraints 1 Session: 15 Aug 2015 (Mon), 10:00am 1:00pm I. Optimization with
More informationSupport Vector Machines and Kernel Methods
2018 CS420 Machine Learning, Lecture 3 Hangout from Prof. Andrew Ng. http://cs229.stanford.edu/notes/cs229notes3.pdf Support Vector Machines and Kernel Methods Weinan Zhang Shanghai Jiao Tong University
More informationInterior Point Algorithms for Constrained Convex Optimization
Interior Point Algorithms for Constrained Convex Optimization Chee Wei Tan CS 8292 : Advanced Topics in Convex Optimization and its Applications Fall 2010 Outline Inequality constrained minimization problems
More informationLagrangian Duality Theory
Lagrangian Duality Theory Yinyu Ye Department of Management Science and Engineering Stanford University Stanford, CA 94305, U.S.A. http://www.stanford.edu/ yyye Chapter 14.14 1 Recall Primal and Dual
More informationGeneralization to inequality constrained problem. Maximize
Lecture 11. 26 September 2006 Review of Lecture #10: Second order optimality conditions necessary condition, sufficient condition. If the necessary condition is violated the point cannot be a local minimum
More informationLecture 23: November 21
10725/36725: Convex Optimization Fall 2016 Lecturer: Ryan Tibshirani Lecture 23: November 21 Scribes: Yifan Sun, Ananya Kumar, Xin Lu Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer:
More informationLecture 8. Strong Duality Results. September 22, 2008
Strong Duality Results September 22, 2008 Outline Lecture 8 Slater Condition and its Variations Convex Objective with Linear Inequality Constraints Quadratic Objective over Quadratic Constraints Representation
More informationSubgradients. subgradients. strong and weak subgradient calculus. optimality conditions via subgradients. directional derivatives
Subgradients subgradients strong and weak subgradient calculus optimality conditions via subgradients directional derivatives Prof. S. Boyd, EE364b, Stanford University Basic inequality recall basic inequality
More informationNonlinear Optimization: What s important?
Nonlinear Optimization: What s important? Julian Hall 10th May 2012 Convexity: convex problems A local minimizer is a global minimizer A solution of f (x) = 0 (stationary point) is a minimizer A global
More informationThe KarushKuhnTucker conditions
Chapter 6 The KarushKuhnTucker conditions 6.1 Introduction In this chapter we derive the first order necessary condition known as KarushKuhnTucker (KKT) conditions. To this aim we introduce the alternative
More informationSupport Vector Machines for Regression
COMP566 Rohan Shah (1) Support Vector Machines for Regression Provided with n training data points {(x 1, y 1 ), (x 2, y 2 ),, (x n, y n )} R s R we seek a function f for a fixed ɛ > 0 such that: f(x
More informationBindel, Spring 2017 Numerical Analysis (CS 4220) Notes for So far, we have considered unconstrained optimization problems.
Consider constraints Notes for 20170424 So far, we have considered unconstrained optimization problems. The constrained problem is minimize φ(x) s.t. x Ω where Ω R n. We usually define x in terms of
More information