COM S 578X: Optimization for Machine Learning

Lecture Note 5: Optimality Conditions
Jia (Kevin) Liu, Assistant Professor
Department of Computer Science, Iowa State University, Ames, Iowa, USA
Fall 2018

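Recap: Last Lecture

Given a minimization problem

$$\begin{aligned} \text{Minimize} \quad & f(x) \\ \text{subject to} \quad & g_i(x) \le 0, \quad i = 1, \dots, m \qquad (\text{dual variable } u_i \ge 0) \\ & h_j(x) = 0, \quad j = 1, \dots, p \qquad (\text{dual variable } v_j \text{ unconstrained}) \end{aligned}$$

we define the Lagrangian

$$L(x, u, v) = f(x) + \sum_{i=1}^{m} u_i g_i(x) + \sum_{j=1}^{p} v_j h_j(x)$$

and the Lagrangian dual function

$$\theta(u, v) = \min_x L(x, u, v).$$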

Recap: Last Lecture

The resulting Lagrangian dual problem, with dual function $\theta(u, v) = \min_x L(x, u, v)$, is

$$\text{Maximize} \quad \theta(u, v) \quad \text{subject to} \quad u \ge 0.$$

Important properties:
- The dual problem is always a convex optimization problem (equivalently, the dual function $\theta$ is always concave), even if the primal problem is nonconvex.
- The weak duality property always holds, i.e., the primal and dual optimal values $p^*$ and $d^*$ satisfy $p^* \ge d^*$.
- Slater's condition: for a convex primal, if there exists $x$ such that $g_1(x) < 0, \dots, g_m(x) < 0$ and $h_1(x) = 0, \dots, h_p(x) = 0$, then strong duality holds: $p^* = d^*$.
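As a small worked illustration of weak and strong duality, consider minimizing $x_1^2 + x_2^2$ subject to $1 - x_1 - x_2 \le 0$. This toy problem is chosen here for illustration and is not taken from the lecture; $\theta$ denotes the dual function.

```latex
\begin{aligned}
L(x, u) &= x_1^2 + x_2^2 + u\,(1 - x_1 - x_2), \qquad u \ge 0, \\
\nabla_x L(x, u) = 0 \;&\Rightarrow\; x_1 = x_2 = u/2, \\
\theta(u) &= \tfrac{u^2}{2} + u\,(1 - u) = u - \tfrac{u^2}{2}, \\
d^* &= \max_{u \ge 0} \theta(u) = \theta(1) = \tfrac{1}{2}, \\
p^* &= f\!\left(\tfrac{1}{2}, \tfrac{1}{2}\right) = \tfrac{1}{2}.
\end{aligned}
```

Every $u \ge 0$ gives $\theta(u) \le \tfrac{1}{2} = p^*$, which is weak duality. The problem is convex and has a strictly feasible point such as $x = (1, 1)$, so Slater's condition applies and $p^* = d^*$.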

Outline

Today:
- KKT conditions
- Geometric interpretation
- Relevant examples in machine learning and other areas

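Karush-Kuhn-Tucker Conditions

Given the general problem

$$\begin{aligned} \text{Minimize} \quad & f(x) \\ \text{subject to} \quad & g_i(x) \le 0, \quad i = 1, \dots, m \qquad (\text{dual variable } u_i \ge 0) \\ & h_j(x) = 0, \quad j = 1, \dots, p \qquad (\text{dual variable } v_j \text{ unconstrained}) \end{aligned}$$

with Lagrangian $L(x, u, v) = f(x) + u^\top g(x) + v^\top h(x)$, the Karush-Kuhn-Tucker (KKT) conditions are:

- Stationarity (ST): $\nabla_x f(x) + \sum_{i=1}^{m} u_i \nabla_x g_i(x) + \sum_{j=1}^{p} v_j \nabla_x h_j(x) = 0$
- Complementary slackness (CS): $u_i g_i(x) = 0 \;\; \forall i$ (that is, either $u_i = 0$ or $g_i(x) = 0$)
- Primal feasibility (PF): $g_i(x) \le 0, \; h_j(x) = 0 \;\; \forall i, j$
- Dual feasibility (DF): $u_i \ge 0 \;\; \forall i$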
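The four conditions can be checked numerically at a candidate point. The sketch below does this for a toy problem chosen here for illustration (minimize $x_1^2 + x_2^2$ subject to $1 - x_1 - x_2 \le 0$); the function names `kkt_residuals` and `is_kkt_point` and the tolerance are assumptions, not part of the lecture.

```python
def kkt_residuals(x, u):
    """Return the KKT residuals (ST, CS, PF, DF) for the toy problem
    minimize x1^2 + x2^2  subject to  g(x) = 1 - x1 - x2 <= 0."""
    x1, x2 = x
    grad_f = (2.0 * x1, 2.0 * x2)   # gradient of the objective
    g = 1.0 - x1 - x2               # inequality constraint value
    grad_g = (-1.0, -1.0)           # gradient of the constraint

    # ST: grad f + u * grad g should vanish (largest component in absolute value).
    stationarity = max(abs(grad_f[k] + u * grad_g[k]) for k in range(2))
    # CS: u * g(x) should be zero.
    comp_slack = abs(u * g)
    # PF: g(x) <= 0, so only a positive value counts as a violation.
    primal_infeas = max(g, 0.0)
    # DF: u >= 0, so only a negative value counts as a violation.
    dual_infeas = max(-u, 0.0)
    return stationarity, comp_slack, primal_infeas, dual_infeas


def is_kkt_point(x, u, tol=1e-9):
    """True if all four KKT residuals are within the tolerance."""
    return all(r <= tol for r in kkt_residuals(x, u))


print(is_kkt_point((0.5, 0.5), 1.0))  # candidate with multiplier u = 1
print(is_kkt_point((1.0, 0.0), 0.0))  # feasible point, but stationarity fails
```

Returning residuals rather than a single boolean makes it easy to see which condition a candidate violates.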

KKT Necessity

Theorem 1. If $x^*$ and $(u^*, v^*)$ are primal and dual solutions with zero duality gap (e.g., implied by convexity and Slater's condition), then $(x^*, u^*, v^*)$ satisfy the KKT conditions.

Proof. PF and DF hold for free from the assumption. Since $x^*$ and $(u^*, v^*)$ are primal and dual solutions with strong duality, and writing $\theta$ for the dual function,

$$\begin{aligned} f(x^*) = \theta(u^*, v^*) &= \min_x \Big[ f(x) + \sum_{i=1}^{m} u_i^* g_i(x) + \sum_{j=1}^{p} v_j^* h_j(x) \Big] \\ &\le f(x^*) + \sum_{i=1}^{m} u_i^* g_i(x^*) + \sum_{j=1}^{p} v_j^* h_j(x^*) \\ &\le f(x^*). \end{aligned}$$

The first equality is strong duality and the second is the definition of the dual function. The first inequality holds because the minimum over $x$ is at most the value at $x^*$. The second holds because $u_i^* \ge 0$ (DF), $g_i(x^*) \le 0$ and $h_j(x^*) = 0$ (PF). Since both ends equal $f(x^*)$, all these inequalities are equalities. Then:

(A) $x^*$ minimizes $L(x, u^*, v^*)$ over $x \in \mathbb{R}^n$ (unconstrained), so the gradient of $L(x, u^*, v^*)$ must be 0 at $x^*$, i.e., the stationarity condition.

(B) Since $u_i^* g_i(x^*) \le 0$ for every $i$ (PF and DF) and $\sum_{i=1}^{m} u_i^* g_i(x^*) = 0$, each term must be zero, i.e., the complementary slackness condition. $\blacksquare$

KKT Sufficiency

Theorem 2. If the primal problem is convex and $x^*$ and $(u^*, v^*)$ satisfy the KKT conditions, then $x^*$ and $(u^*, v^*)$ are primal and dual optimal solutions, respectively.

Proof. If $x^*$ and $(u^*, v^*)$ satisfy the KKT conditions, then, writing $\theta$ for the dual function,

$$\theta(u^*, v^*) \overset{(a)}{=} f(x^*) + \sum_{i=1}^{m} u_i^* g_i(x^*) + \sum_{j=1}^{p} v_j^* h_j(x^*) \overset{(b)}{=} f(x^*),$$

where (a) follows from ST: because the primal is convex, $L(x, u^*, v^*)$ is convex in $x$, so the stationary point $x^*$ is a minimizer of $L(\cdot, u^*, v^*)$. Step (b) follows from CS ($u_i^* g_i(x^*) = 0$) and PF ($h_j(x^*) = 0$). Therefore the duality gap is zero. Note that $x^*$ and $(u^*, v^*)$ are primal and dual feasible (PF and DF). Hence, by weak duality, they are primal and dual optimal, respectively. $\blacksquare$

In Summary

So, putting things together:

Theorem 3. For a convex optimization problem with strong duality (e.g., implied by Slater's condition or other constraint qualifications):

$x^*$ and $(u^*, v^*)$ are primal and dual solutions $\iff$ $x^*$ and $(u^*, v^*)$ satisfy the KKT conditions.

Warning: This statement is only true for convex optimization problems. For nonconvex optimization problems, KKT conditions are neither necessary nor sufficient! (More on this shortly.)

Where Does This Name Come From?

- Older books and papers referred to these as the KT (Kuhn-Tucker) conditions.
- They first appeared in a publication by Kuhn and Tucker in 1951.
- Kuhn and Tucker shared the John von Neumann Theory Prize in 1980.
- Later, people realized that Karush had the same conditions in his unpublished master's thesis in 1939.

[Photos: William Karush, Harold W. Kuhn, Albert W. Tucker]

A fun read: R. W. Cottle, "William Karush and the KKT Theorem," Documenta Mathematica, 2012, pp.

Other Optimality Conditions

KKT conditions are a special case of the more general Fritz John conditions:

$$u_0 \nabla f(x^*) + \sum_{i=1}^{m} u_i \nabla g_i(x^*) + \sum_{j=1}^{p} v_j \nabla h_j(x^*) = 0,$$

where $u_0$ could be 0.

In turn, Fritz John conditions (hence KKT) belong to a wider class of first-order necessary conditions (FONC), which allow for nonsmooth functions using subderivatives.

Further, there is a whole class of second-order necessary and sufficient conditions (SONC, SOSC), also in KKT style.

For an excellent treatment of optimality conditions, see [BSS Ch4 Ch6].

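Geometric Interpretation of KKT

Set of binding (active) constraints at $x^*$: $I(x^*) = \{ i : g_i(x^*) = 0 \}$. By complementary slackness (CS), $u_i$ can be nonzero only for $i \in I(x^*)$.

Physics interpretation: $-\nabla f(x^*)$ acts as a pulling force that is balanced by the forces $u_i \nabla g_i(x^*)$, $i \in I(x^*)$, of the active constraints, so that the forces sum to zero.

[Figure: hand-drawn sketch of the gradients of the objective and of the active constraints at $x^*$]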

When is KKT neither sufficient nor necessary?

(Not necessary): $x^*$ is a (local) minimum $\not\Rightarrow$ $x^*$ is a KKT point. Note: such an $x^*$ is still a Fritz John point (with $u_0 = 0$).

(Not sufficient): $x^*$ is a KKT point $\not\Rightarrow$ $x^*$ is a (local) minimum. This can happen when the problem is nonconvex, e.g., when $g(x)$ is nonconvex.

[Figures: hand-drawn examples for both cases]
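Both failure modes can be seen in small textbook-style examples. The two problems below are illustrations chosen here and are not necessarily the ones drawn in lecture.

```latex
% (Not necessary) Minimize -x_1 subject to
%   g_1(x) = x_2 - (1 - x_1)^3 \le 0, \qquad g_2(x) = -x_2 \le 0.
% The feasible set requires x_1 \le 1, so x^* = (1, 0) is the global minimum.
\nabla f(x^*) = \begin{bmatrix} -1 \\ 0 \end{bmatrix}, \quad
\nabla g_1(x^*) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad
\nabla g_2(x^*) = \begin{bmatrix} 0 \\ -1 \end{bmatrix}
\;\Rightarrow\;
\nabla f(x^*) + u_1 \nabla g_1(x^*) + u_2 \nabla g_2(x^*) \ne 0
\quad \text{for all } u_1, u_2 \ge 0.
% Fritz John still holds with u_0 = 0 and u_1 = u_2 = 1.

% (Not sufficient) Minimize -x^2 subject to g(x) = x^2 - 1 \le 0.
% At x = 0 with u = 0:
-2x + u \cdot 2x = 0 \;(\text{ST}), \quad
u\,g(x) = 0 \;(\text{CS}), \quad
g(0) = -1 \le 0 \;(\text{PF}), \quad
u \ge 0 \;(\text{DF}),
% yet f(0) = 0 > f(\varepsilon) = -\varepsilon^2 for every small \varepsilon \ne 0,
% so x = 0 is a local maximum, not a local minimum.
```

In the first example the first component of the stationarity equation is $-1$ regardless of the multipliers, which is why no KKT multipliers exist even though $x^*$ is optimal.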

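Example 1: Quadratic Problems with Equality Constraints

Consider, for $Q \succeq 0$, the quadratic programming problem

$$\text{Minimize}_x \quad \tfrac{1}{2} x^\top Q x + c^\top x \quad \text{subject to} \quad Ax = 0.$$

This is a convex problem without inequality constraints. With dual variable $u$ for the equality constraint, the Lagrangian is $L(x, u) = \tfrac{1}{2} x^\top Q x + c^\top x + u^\top A x$. By KKT, $x$ is primal optimal iff

$$\begin{bmatrix} Q & A^\top \\ A & 0 \end{bmatrix} \begin{bmatrix} x \\ u \end{bmatrix} = \begin{bmatrix} -c \\ 0 \end{bmatrix}$$

for some dual variable $u$. This linear system combines ST ($Qx + c + A^\top u = 0$) and PF ($Ax = 0$); CS and DF are vacuous because there are no inequality constraints.

Such systems often arise from using Newton's method to solve equality-constrained problems $\{ \min_x f(x) : Ax = b \}$. By Taylor expansion, $f(x + \Delta x) \approx f(x) + \nabla f(x)^\top \Delta x + \tfrac{1}{2} \Delta x^\top \nabla^2 f(x) \Delta x$, and a step $\Delta x$ from a feasible $x$ must satisfy $A \Delta x = 0$. The Newton step therefore solves a problem of the above form with $Q = \nabla^2 f(x)$ and $c = \nabla f(x)$.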
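The KKT linear system can be assembled and solved directly. The sketch below uses only the standard library and a small Gaussian elimination; the helper names `solve_linear_system` and `solve_equality_qp` and the example data are assumptions made for illustration, and in practice a linear algebra library would normally be used.

```python
def solve_linear_system(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(M[i]) + [rhs[i]] for i in range(n)]  # augmented matrix
    for col in range(n):
        # Pick the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("KKT matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution.
    z = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * z[k] for k in range(i + 1, n))
        z[i] = (aug[i][n] - s) / aug[i][i]
    return z


def solve_equality_qp(Q, c, A):
    """Solve min 0.5 x'Qx + c'x s.t. Ax = 0 via the system [Q A'; A 0][x; u] = [-c; 0]."""
    n, p = len(Q), len(A)
    K = []
    for i in range(n):                       # rows [Q  A^T]  (stationarity)
        K.append(list(Q[i]) + [A[j][i] for j in range(p)])
    for j in range(p):                       # rows [A  0]    (primal feasibility)
        K.append(list(A[j]) + [0.0] * p)
    rhs = [-ci for ci in c] + [0.0] * p
    z = solve_linear_system(K, rhs)
    return z[:n], z[n:]                      # primal x, dual u


# Illustrative data: minimize x1^2 + x2^2 - 2*x1 subject to x1 + x2 = 0.
Q = [[2.0, 0.0], [0.0, 2.0]]
c = [-2.0, 0.0]
A = [[1.0, 1.0]]
x, u = solve_equality_qp(Q, c, A)
print(x, u)
```

For this data, substituting $x_2 = -x_1$ by hand gives $x = (0.5, -0.5)$, and stationarity then gives $u = 1$, which the solver should reproduce up to rounding.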

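Example 2: Support Vector Machine

Given labels $y \in \{-1, 1\}^m$ and feature vectors $x_1, \dots, x_m$, let $X = [x_1 \cdots x_m]^\top$. Recall from Lecture 1 the support vector machine problem:

$$\begin{aligned} \text{Minimize}_{w, b, \xi} \quad & \tfrac{1}{2} \|w\|^2 + C \sum_{i=1}^{m} \xi_i \\ \text{subject to} \quad & y_i (w^\top x_i + b) \ge 1 - \xi_i, \quad i = 1, \dots, m \qquad (\text{dual variable } u_i \ge 0) \\ & \xi_i \ge 0, \quad i = 1, \dots, m \qquad (\text{dual variable } v_i \ge 0) \end{aligned}$$

This is a quadratic program, and Slater's condition holds. Introducing dual variables $u, v \ge 0$ (DF), the Lagrangian is

$$L(w, b, \xi, u, v) = \tfrac{1}{2} \|w\|^2 + C \sum_{i=1}^{m} \xi_i + \sum_{i=1}^{m} u_i \big( 1 - \xi_i - y_i (w^\top x_i + b) \big) - \sum_{i=1}^{m} v_i \xi_i,$$

and we obtain the KKT system:

(ST): $0 = \sum_{i=1}^{m} u_i y_i, \quad w = \sum_{i=1}^{m} u_i y_i x_i, \quad u = C \mathbf{1} - v$

(CS): $v_i \xi_i = 0, \quad u_i \big( 1 - \xi_i - y_i (x_i^\top w + b) \big) = 0, \quad i = 1, \dots, m$

The three ST equations come from setting the derivatives of $L$ with respect to $b$, $w$, and $\xi$ to zero, respectively.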

Example 2: Support Vector Machine

Hence, at optimality we have $w = \sum_{i=1}^{m} u_i y_i x_i$, and $u_i$ is nonzero only if $y_i (x_i^\top w + b) = 1 - \xi_i$. Such points are called the support points.

- For support point $i$, if $\xi_i = 0$, then $x_i$ lies on the edge of the margin and $u_i \in (0, C]$.
- For support point $i$, if $\xi_i \ne 0$, then $x_i$ lies on the wrong side of the margin and $u_i = C$ (since $\xi_i \ne 0$ forces $v_i = 0$ by CS).

[Figure: separating hyperplane $w^\top x + b = 0$ with a margin of width $2 / \|w\|$, i.e., $1 / \|w\|$ on each side]

KKT conditions do not really give us a way to find the solution here, but they give better understanding and are useful in proofs.

In fact, we can use this to screen away non-support points before performing optimization (lower complexity).
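The relations above can be sketched in code: given dual variables $u$, recover $w$ from stationarity and sort points by the value of $u_i$. The function names and the three-point data set below are hypothetical and chosen only for illustration; the dual values were worked out by hand for this toy data with $C = 1$ and $b = 0$.

```python
def recover_w(u, y, X):
    """Stationarity: w = sum_i u_i * y_i * x_i."""
    dim = len(X[0])
    return [sum(u[i] * y[i] * X[i][k] for i in range(len(X))) for k in range(dim)]


def classify_points(u, C, tol=1e-9):
    """Label each point from its dual variable, following complementary slackness."""
    labels = []
    for ui in u:
        if ui <= tol:
            labels.append("non-support")          # u_i = 0
        elif ui < C - tol:
            labels.append("on margin")            # 0 < u_i < C forces xi_i = 0
        else:
            labels.append("at bound")             # u_i = C: on the margin or violating it
    return labels


def margins(w, b, y, X):
    """Functional margins y_i * (w^T x_i + b)."""
    return [y[i] * (sum(w[k] * X[i][k] for k in range(len(w))) + b)
            for i in range(len(X))]


# Toy data (hypothetical): two points on the margin and one far from it.
X = [(1.0, 0.0), (-1.0, 0.0), (3.0, 1.0)]
y = [1, -1, 1]
u = [0.5, 0.5, 0.0]
C, b = 1.0, 0.0

w = recover_w(u, y, X)
print(w)
print(classify_points(u, C))
print(margins(w, b, y, X))
```

Points with $u_i = 0$ contribute nothing to $w$, which is the observation behind screening away non-support points.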

Example 3: Waterfilling

Example from [BV]: Consider the problem

$$\begin{aligned} \text{Minimize}_x \quad & -\sum_{i=1}^{n} \log(\alpha_i + x_i) \\ \text{subject to} \quad & x \ge 0, \quad \mathbf{1}^\top x = 1. \end{aligned}$$

In information theory, $\log(\alpha_i + x_i)$ is the communication rate of the $i$-th channel.

Introducing dual variables $u \ge 0$ (for $x \ge 0$) and $v$ unconstrained (for $\mathbf{1}^\top x = 1$), we obtain the KKT system:

(ST): $-1 / (\alpha_i + x_i) - u_i + v = 0, \quad i = 1, \dots, n$

(CS): $u_i x_i = 0, \quad i = 1, \dots, n$

(PF): $x \ge 0, \quad \mathbf{1}^\top x = 1$

(DF): $u \ge 0$

Eliminating $u$ (using $u_i = v - 1 / (\alpha_i + x_i)$ from ST) yields:

$$1 / (\alpha_i + x_i) \le v, \qquad x_i \big( v - 1 / (\alpha_i + x_i) \big) = 0, \quad i = 1, \dots, n, \qquad x \ge 0, \quad \mathbf{1}^\top x = 1.$$

Example 3: Waterfilling

ST and CS imply that

$$x_i = \begin{cases} 1/v - \alpha_i & \text{if } v < 1/\alpha_i \\ 0 & \text{if } v \ge 1/\alpha_i \end{cases} \quad \Longrightarrow \quad x_i = \max\{0, \; 1/v - \alpha_i\}, \quad i = 1, \dots, n.$$

Also, from PF, i.e., $\mathbf{1}^\top x = 1$, we have

$$\sum_{i=1}^{n} \max\{0, \; 1/v - \alpha_i\} = 1.$$

This is a univariate equation, piecewise linear in $1/v$, and not hard to solve. The solution of this reduced problem is referred to as the waterfilling solution (from [BV], p. 246).

[Figure: waterfilling illustration with the water level marked]
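Since the left-hand side of the univariate equation is nondecreasing in the water level $1/v$, one simple way to solve it is bisection. The sketch below is a minimal version under the assumption that all $\alpha_i > 0$; the function name `waterfill`, the iteration count, and the example values are illustrative choices, not from the lecture.

```python
def waterfill(alpha, total=1.0, iters=100):
    """Solve sum_i max(0, level - alpha_i) = total for the water level (= 1/v)
    by bisection, and return the allocation x together with the level."""
    # At level = min(alpha) nothing is allocated; at min(alpha) + total the
    # lowest channel alone already receives `total`, so the root is bracketed.
    lo = min(alpha)
    hi = min(alpha) + total
    for _ in range(iters):
        level = 0.5 * (lo + hi)
        used = sum(max(0.0, level - a) for a in alpha)
        if used > total:
            hi = level      # too much water: lower the level
        else:
            lo = level      # not enough (or exactly enough): raise the level
    level = 0.5 * (lo + hi)
    x = [max(0.0, level - a) for a in alpha]
    return x, level


# Illustrative data: the third channel is too "high" to receive any water.
alpha = [0.1, 0.5, 2.0]
x, level = waterfill(alpha)
print(x, level)
print(sum(x))
```

For this data the equation $(t - 0.1) + (t - 0.5) = 1$ gives the level $t = 0.8$ by hand, so the allocation should be approximately $(0.7, 0.3, 0)$. Because the equation is piecewise linear, an exact solution by sorting the $\alpha_i$ is also possible; bisection is used here only for brevity.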

Next Class

Gradient Descent


More information

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 0-0 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R

More information

Support Vector Machines, Kernel SVM

Support Vector Machines, Kernel SVM Support Vector Machines, Kernel SVM Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 27, 2017 1 / 40 Outline 1 Administration 2 Review of last lecture 3 SVM

More information

Gradient Descent. Dr. Xiaowei Huang

Gradient Descent. Dr. Xiaowei Huang Gradient Descent Dr. Xiaowei Huang https://cgi.csc.liv.ac.uk/~xiaowei/ Up to now, Three machine learning algorithms: decision tree learning k-nn linear regression only optimization objectives are discussed,

More information

Convex Optimization. Lecture 12 - Equality Constrained Optimization. Instructor: Yuanzhang Xiao. Fall University of Hawaii at Manoa

Convex Optimization. Lecture 12 - Equality Constrained Optimization. Instructor: Yuanzhang Xiao. Fall University of Hawaii at Manoa Convex Optimization Lecture 12 - Equality Constrained Optimization Instructor: Yuanzhang Xiao University of Hawaii at Manoa Fall 2017 1 / 19 Today s Lecture 1 Basic Concepts 2 for Equality Constrained

More information

Lecture 8. Strong Duality Results. September 22, 2008

Lecture 8. Strong Duality Results. September 22, 2008 Strong Duality Results September 22, 2008 Outline Lecture 8 Slater Condition and its Variations Convex Objective with Linear Inequality Constraints Quadratic Objective over Quadratic Constraints Representation

More information

CSC 411 Lecture 17: Support Vector Machine

CSC 411 Lecture 17: Support Vector Machine CSC 411 Lecture 17: Support Vector Machine Ethan Fetaya, James Lucas and Emad Andrews University of Toronto CSC411 Lec17 1 / 1 Today Max-margin classification SVM Hard SVM Duality Soft SVM CSC411 Lec17

More information

Machine Learning. Support Vector Machines. Manfred Huber

Machine Learning. Support Vector Machines. Manfred Huber Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data

More information

CSCI5654 (Linear Programming, Fall 2013) Lecture-8. Lecture 8 Slide# 1

CSCI5654 (Linear Programming, Fall 2013) Lecture-8. Lecture 8 Slide# 1 CSCI5654 (Linear Programming, Fall 2013) Lecture-8 Lecture 8 Slide# 1 Today s Lecture 1. Recap of dual variables and strong duality. 2. Complementary Slackness Theorem. 3. Interpretation of dual variables.

More information

Chapter 2. Optimization. Gradients, convexity, and ALS

Chapter 2. Optimization. Gradients, convexity, and ALS Chapter 2 Optimization Gradients, convexity, and ALS Contents Background Gradient descent Stochastic gradient descent Newton s method Alternating least squares KKT conditions 2 Motivation We can solve

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Sridhar Mahadevan mahadeva@cs.umass.edu University of Massachusetts Sridhar Mahadevan: CMPSCI 689 p. 1/32 Margin Classifiers margin b = 0 Sridhar Mahadevan: CMPSCI 689 p.

More information

Support Vector Machines

Support Vector Machines Wien, June, 2010 Paul Hofmarcher, Stefan Theussl, WU Wien Hofmarcher/Theussl SVM 1/21 Linear Separable Separating Hyperplanes Non-Linear Separable Soft-Margin Hyperplanes Hofmarcher/Theussl SVM 2/21 (SVM)

More information

Support Vector Machines for Classification and Regression

Support Vector Machines for Classification and Regression CIS 520: Machine Learning Oct 04, 207 Support Vector Machines for Classification and Regression Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may

More information

Introduction to Machine Learning Spring 2018 Note Duality. 1.1 Primal and Dual Problem

Introduction to Machine Learning Spring 2018 Note Duality. 1.1 Primal and Dual Problem CS 189 Introduction to Machine Learning Spring 2018 Note 22 1 Duality As we have seen in our discussion of kernels, ridge regression can be viewed in two ways: (1) an optimization problem over the weights

More information

Chap 2. Optimality conditions

Chap 2. Optimality conditions Chap 2. Optimality conditions Version: 29-09-2012 2.1 Optimality conditions in unconstrained optimization Recall the definitions of global, local minimizer. Geometry of minimization Consider for f C 1

More information

CE 191: Civil & Environmental Engineering Systems Analysis. LEC 17 : Final Review

CE 191: Civil & Environmental Engineering Systems Analysis. LEC 17 : Final Review CE 191: Civil & Environmental Engineering Systems Analysis LEC 17 : Final Review Professor Scott Moura Civil & Environmental Engineering University of California, Berkeley Fall 2014 Prof. Moura UC Berkeley

More information

Lagrangian Duality and Convex Optimization

Lagrangian Duality and Convex Optimization Lagrangian Duality and Convex Optimization David Rosenberg New York University February 11, 2015 David Rosenberg (New York University) DS-GA 1003 February 11, 2015 1 / 24 Introduction Why Convex Optimization?

More information

Optimization. A first course on mathematics for economists

Optimization. A first course on mathematics for economists Optimization. A first course on mathematics for economists Xavier Martinez-Giralt Universitat Autònoma de Barcelona xavier.martinez.giralt@uab.eu II.3 Static optimization - Non-Linear programming OPT p.1/45

More information

Tutorial on Convex Optimization: Part II

Tutorial on Convex Optimization: Part II Tutorial on Convex Optimization: Part II Dr. Khaled Ardah Communications Research Laboratory TU Ilmenau Dec. 18, 2018 Outline Convex Optimization Review Lagrangian Duality Applications Optimal Power Allocation

More information

Linear and Combinatorial Optimization

Linear and Combinatorial Optimization Linear and Combinatorial Optimization The dual of an LP-problem. Connections between primal and dual. Duality theorems and complementary slack. Philipp Birken (Ctr. for the Math. Sc.) Lecture 3: Duality

More information

Lecture 3. Optimization Problems and Iterative Algorithms

Lecture 3. Optimization Problems and Iterative Algorithms Lecture 3 Optimization Problems and Iterative Algorithms January 13, 2016 This material was jointly developed with Angelia Nedić at UIUC for IE 598ns Outline Special Functions: Linear, Quadratic, Convex

More information

10701 Recitation 5 Duality and SVM. Ahmed Hefny

10701 Recitation 5 Duality and SVM. Ahmed Hefny 10701 Recitation 5 Duality and SVM Ahmed Hefny Outline Langrangian and Duality The Lagrangian Duality Eamples Support Vector Machines Primal Formulation Dual Formulation Soft Margin and Hinge Loss Lagrangian

More information

CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS

CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS CONSTRAINT QUALIFICATIONS, LAGRANGIAN DUALITY & SADDLE POINT OPTIMALITY CONDITIONS A Dissertation Submitted For The Award of the Degree of Master of Philosophy in Mathematics Neelam Patel School of Mathematics

More information

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 4. Subgradient

Shiqian Ma, MAT-258A: Numerical Optimization 1. Chapter 4. Subgradient Shiqian Ma, MAT-258A: Numerical Optimization 1 Chapter 4 Subgradient Shiqian Ma, MAT-258A: Numerical Optimization 2 4.1. Subgradients definition subgradient calculus duality and optimality conditions Shiqian

More information

Lecture Note 5: Semidefinite Programming for Stability Analysis

Lecture Note 5: Semidefinite Programming for Stability Analysis ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State

More information

Machine Learning Techniques

Machine Learning Techniques Machine Learning Techniques ( 機器學習技法 ) Lecture 2: Dual Support Vector Machine Hsuan-Tien Lin ( 林軒田 ) htlin@csie.ntu.edu.tw Department of Computer Science & Information Engineering National Taiwan University

More information

Duality in Linear Programs. Lecturer: Ryan Tibshirani Convex Optimization /36-725

Duality in Linear Programs. Lecturer: Ryan Tibshirani Convex Optimization /36-725 Duality in Linear Programs Lecturer: Ryan Tibshirani Convex Optimization 10-725/36-725 1 Last time: proximal gradient descent Consider the problem x g(x) + h(x) with g, h convex, g differentiable, and

More information

Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012

Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012 Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Linear classifier Which classifier? x 2 x 1 2 Linear classifier Margin concept x 2

More information