Math 5311 Constrained Optimization Notes

February 5

1 Equality-constrained optimization

Real-world optimization problems frequently have constraints on their variables. Constraints may be equality constraints, for example, "The total mass of the system must equal 3," or inequality constraints, for example, "The total mass of the system must be less than or equal to 3." In this course, we'll restrict ourselves to equality constraints. As always, we're restricting ourselves to variables living in a vector space. So in general, we'll be solving problems of the form

\[ \min_{x \in V} f(x) \quad \text{s.t.} \quad g(x) = 0 \]

where the objective function (or cost function) is $f : V \to \mathbb{R}$ and the constraints are $g : V \to U$, where $U$ is some other vector space. The abbreviation "s.t." is customary for the phrase "such that."

Usually, we will have $\dim(U) < \dim(V)$ and $g$ a non-invertible function. Were $g$ invertible, we could solve $g(x) = 0$ for a unique $x$ and be done: if only one point is consistent with the constraints, then the minimum must be at that point. In other words, to be an interesting optimization problem, the constraints should form an underdetermined system of equations.

1.1 Examples of equality-constrained optimization problems

1. Minimize $x_1^2 + x_2^2 + x_1 x_2 - 2x_1 + 3x_2$ over $x \in \mathbb{R}^2$ subject to $x_1 + 3x_2 = 2$. Here, the constraint $g : \mathbb{R}^2 \to \mathbb{R}$ is $g(x) = Ax - b$ where $A = \begin{pmatrix} 1 & 3 \end{pmatrix}$ and $b = 2$.

2. Minimize $\sin(x_1 x_2^2)$ over $x \in \mathbb{R}^2$ subject to $x_2 = \cos x_1$. This problem is rather easily transformed to the 1D problem: $\min \sin(x \cos^2 x)$ over $x \in \mathbb{R}$.

3. Minimize $Wu$ over $u \in H^1(-1, 1)$ subject to $u(-1) = 1$, $u(1) = 0$, where $W : H^1(-1, 1) \to \mathbb{R}$ is the functional

\[ Wu = \int_{-1}^{1} \left( \frac{1}{2} u_x^2 + u \right) dx. \]

We can write the constraint as

\[ g(u) = \begin{pmatrix} u(-1) - 1 \\ u(1) \end{pmatrix}. \]

Note that $g$ maps the infinite-dimensional $H^1$ to the finite-dimensional $\mathbb{R}^2$.

4. If we discretize example 3 on the $N$-th order Vandermonde basis $\{x^i\}_{i=0}^{N}$, we get the following finite-dimensional minimization problem: minimize

\[ W(u) = \frac{1}{2} \sum_{i,j=1}^{N} \frac{ij \left(1 + (-1)^{i+j}\right)}{i + j - 1}\, u_i u_j + \sum_{i=0}^{N} \frac{1 + (-1)^{i}}{i + 1}\, u_i \]

over $u \in \mathbb{R}^{N+1}$ subject to

\[ \sum_{i=0}^{N} u_i = 0, \qquad \sum_{i=0}^{N} (-1)^i u_i = 1. \]

5. Minimize $\frac{1}{2} x^T K x$ over $x \in \mathbb{R}^N$ subject to $x^T x = 1$.

2 Solving equality-constrained differentiable optimization problems

In calculus you learned the method of Lagrange multipliers for solving constrained optimization problems in $\mathbb{R}^2$ and $\mathbb{R}^3$. Here, we'll develop the Lagrange Multiplier Theorem (LMT) in a general vector space setting using Gâteaux differentials. To help understand the notation and content of the theorem, I'll first state and prove the LMT with a finite number of constraints, then extend it (without proof) to constraints taking values in an arbitrary vector space $U$.

2.1 The Lagrange Multiplier Theorem with a finite number of constraints

Theorem 1. Let $f, g_1, \dots, g_M$ be Fréchet differentiable real-valued functions on $V$. To avoid trivial cases, assume $M < \dim(V)$, that is, we have fewer constraints than dimensions, and also assume we have no redundant constraints. Let $\Omega \subset V$ be the subset of $V$ satisfying the constraints, that is,

\[ \Omega = \{ x \in V \mid g_1(x) = 0 \wedge g_2(x) = 0 \wedge \cdots \wedge g_M(x) = 0 \}. \]

If $x^*$ is a local minimizer of $f$ in $\Omega$, then there exist $\{\lambda_i\}_{i=1}^{M}$ such that

\[ Df(x^*) + \sum_{i=1}^{M} \lambda_i\, Dg_i(x^*) = 0. \]

This theorem warrants some discussion.

1. The LMT establishes a necessary condition for $x^*$ to be a minimizer: if $x^*$ is a minimizer, then certain equations involving $x^*$ must hold. It does not follow that if the equations hold, $x^*$ is a minimizer. It may be a maximizer or a saddle point.

2. In $\mathbb{R}^N$ the Fréchet derivative is just the $N$-dimensional gradient, a vector having $N$ components. The set of equations $\nabla f(x^*) + \sum_{i=1}^{M} \lambda_i \nabla g_i(x^*) = 0$ therefore consists of $N$ equations in $N + M$ unknowns ($N$ components of $x^*$ plus $M$ multipliers), so on its own it's underdetermined. The remaining $M$ equations are the constraints, $g_i(x^*) = 0$ for $i = 1$ to $M$. You must simultaneously solve the multiplier equations and the constraints. In general, this can be quite difficult. The full system of equations is often called the equality-constrained Karush-Kuhn-Tucker equations, or KKT equations.

3. I've deliberately left vague what is meant by "redundant constraints." As an exercise in your ability to formulate precise mathematical statements of "obvious" concepts, try to develop a clear and complete definition of redundant constraints.
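The equation counting in remark 2 can be checked on a small case. The sketch below assumes an illustrative problem with $N = 2$ and $M = 1$: minimize $x_1^2 + x_2^2 + x_1 x_2 - 2x_1 + 3x_2$ subject to $x_1 + 3x_2 = 2$. Because the objective is quadratic and the constraint is linear, the two multiplier equations plus the one constraint form a 3-by-3 linear system. The names `H`, `c`, `a`, `b`, and `kkt` are chosen here for illustration and are not notation from the notes.

```python
import numpy as np

# Assumed illustrative problem: f(x) = 0.5 x^T H x + c^T x,  g(x) = a^T x - b.
H = np.array([[2.0, 1.0],
              [1.0, 2.0]])      # Hessian of x1^2 + x2^2 + x1*x2
c = np.array([-2.0, 3.0])       # linear part: -2*x1 + 3*x2
a = np.array([1.0, 3.0])        # constraint gradient
b = 2.0                         # constraint right-hand side

# Multiplier equations: H x + c + lam * a = 0   (N = 2 equations)
# Constraint:           a^T x = b               (M = 1 equation)
kkt = np.block([[H, a[:, None]],
                [a[None, :], np.zeros((1, 1))]])
rhs = np.concatenate([-c, [b]])

x1, x2, lam = np.linalg.solve(kkt, rhs)
print("x* =", (x1, x2), " lambda =", lam)
```

Working the system by hand gives $x^* = (25/14,\ 1/14)$ and $\lambda = -23/14$. Since `H` is positive definite, this stationary point is in fact the constrained minimizer, although the LMT alone does not guarantee that.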

2.2 Proof of the LMT with a finite number of constraints

Proof. Let $f : V \to \mathbb{R}$ and $g : V \to \mathbb{R}^M$ be Fréchet differentiable in some open set $S \ni x^*$. We'll refer to the $i$-th component of $g$ as $g_i$. Because $g_i$ is Fréchet differentiable at $x^*$, it has a unique tangent plane $T_i$ at $x^*$. Let $x^*$ satisfy the constraint equations $g(x^*) = 0$. At $x^*$ the tangent hyperplane $T_i$ to the constraint surface $g_i = 0$ is defined as the set of directions $h$ such that $d_h g_i(x^*) = 0$. For $x^*$ to be a stationary point of $f$ subject to $g(x^*) = 0$, it must be the case that $d_h f(x^*) = 0$ for all vectors $h$ lying in every tangent hyperplane, that is, for all $h$ such that $d_h g_i(x^*) = 0$ for $i = 1, \dots, M$.

Now, because we've stipulated that $f$ is Fréchet differentiable, we know that $d_h f = Df\, h$ for all $h \in V$. For the same reason we know that $d_h g_i = Dg_i\, h$ for all $h \in V$. In other words, both $d_h f$ and $d_h g_i$ are linear functions of $h$. Furthermore, by the requirement for a stationary point we know that for all values of $h$ such that $d_h g_i(x^*) = 0$, we have $d_h f(x^*) = 0$ as well. The only linear functions of $h$ satisfying that condition are linear combinations of the linear functions $Dg_i(x^*) h$, that is,

\[ Df(x^*) h = -\sum_{i=1}^{M} \lambda_i\, Dg_i(x^*) h. \]

This must be true for all $h$, so the LMT follows.

2.3 Several useful corollaries

Corollary 2. If $x^*$ is a stationary point of $f$ subject to the constraint $g(x^*) = 0$, then $(x^*, \lambda^*)$ is a stationary point of the function $L(x, \lambda) = f(x) + \lambda^T g(x)$.

The proof is straightforward. The function $L$ is usually called the Lagrangian. Warning: in mechanics, there is another function called the Lagrangian, also usually denoted by $L$; these are not the same functions. Unless otherwise specified, "the Lagrangian" will always refer to $L = f + \lambda^T g$, not the function defined in mechanics.

Sometimes you'll see the Lagrangian defined as $L = f - \lambda^T g$. The choice of sign is irrelevant to the value of $x^*$, and can be chosen as convenient for a given problem. Changing the sign in the Lagrangian will, of course, change the sign of the Lagrange multipliers (unless you also change the sign of $g$).

Corollary 3. If $x^*$ is a minimizer of $f : V \to \mathbb{R}$ subject to the constraints $g : V \to \mathbb{R}^M$, then

\[ d_h f(x^*) + \sum_{i=1}^{M} \lambda_i^*\, d_h g_i(x^*) = 0 \quad \forall h \in V. \]

The proof follows immediately from the LMT and the definition of the Fréchet derivative.

2.4 The LMT with infinitely many constraints

Theorem 4. Let $f : V \to \mathbb{R}$ and $g : V \to U$ be Fréchet differentiable functions. Let $\langle \cdot, \cdot \rangle_U$ be an inner product on the vector space $U$. Assume $\dim(U) < \dim(V)$. Let $\Omega \subset V$ be the subset of $V$ satisfying the constraints. If $x^*$ is a local minimizer of $f$ in $\Omega$, then there exists $\lambda^* \in U$ such that

\[ Df(x^*) + \langle \lambda^*, Dg(x^*) \rangle_U = 0. \]

3 Examples

3.1 Quadratic objective functions, linear constraints

Consider optimization in $\mathbb{R}^N$ with linear constraints. Let $p(x) = \frac{1}{2} x^T K x - x^T f$ with $K$ symmetric, and let $g(x) = Ax - b$. Form the Lagrangian

\[ L = \frac{1}{2} x^T K x - x^T f + \lambda^T A x - \lambda^T b. \]

The KKT equations are

\[ d_{(h,\mu)} L = h^T (Kx - f + A^T \lambda) + \mu^T (Ax - b) = 0 \quad \forall h, \mu, \]

or in matrix form,

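\[ \begin{pmatrix} K & A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} x \\ \lambda \end{pmatrix} = \begin{pmatrix} f \\ b \end{pmatrix}. \]

The question of when these KKT equations have solutions will be explored in one of your homework problems.

3.2 Quadratic objective functions, one quadratic constraint

Let $K$ be a symmetric $N$ by $N$ matrix. Minimize the quadratic form $p(x) = \frac{1}{2} x^T K x$ subject to $g(x) = x^T x - 1 = 0$. Note that $g : \mathbb{R}^N \to \mathbb{R}$ is a single constraint. The Lagrangian is

\[ L = \frac{1}{2} x^T K x - \frac{\lambda}{2} (x^T x - 1). \]

Note that I've used a negative sign in the definition of the Lagrangian (and scaled the multiplier by $\frac{1}{2}$ to match the quadratic form); this will result in a more conventional form of the equations. Taking differentials of the Lagrangian and setting to zero gives

\[ d_{(h,\mu)} L = h^T (Kx - \lambda x) - \frac{\mu}{2} (x^T x - 1) = 0 \quad \forall h \in \mathbb{R}^N,\ \mu \in \mathbb{R}. \]

This results in two equations to be solved: $Kx = \lambda x$ and $x^T x = 1$. The first is an eigenvalue problem for eigenvectors $x$ and eigenvalue $\lambda$; the second is a normalization condition on the eigenvectors.

3.3 A quadratic functional with boundary conditions

Minimize $Wu = \int_{-1}^{1} \left( \frac{1}{2} u_x^2 + u \right) dx$ over $u \in H^1$ subject to the constraints $u(-1) = 1$ and $u(1) = 0$. Form a Lagrangian

\[ L = W + \lambda_1 (u(-1) - 1) + \lambda_2 u(1). \]

The stationary point is given by the solution to

\[ d_{(v,\mu)} L = \int_{-1}^{1} (u_x v_x + v)\, dx + \lambda_1 v(-1) + \lambda_2 v(1) + \mu_1 (u(-1) - 1) + \mu_2 u(1) = 0 \quad \forall v \in H^1,\ \mu \in \mathbb{R}^2. \]

Integration by parts gives

\[ d_{(v,\mu)} L = \int_{-1}^{1} v\, (1 - u_{xx})\, dx + (\lambda_1 + \partial_n u(-1))\, v(-1) + (\lambda_2 + \partial_n u(1))\, v(1) + \mu_1 (u(-1) - 1) + \mu_2 u(1) = 0 \]

for all $v \in H^1$, $\mu \in \mathbb{R}^2$, where $\partial_n$ denotes the outward normal derivative, so that $\partial_n u(1) = u_x(1)$ and $\partial_n u(-1) = -u_x(-1)$. The minimum will be given by the solution to $u_{xx} = 1$ with boundary conditions $u(-1) = 1$, $u(1) = 0$. The multipliers will be $\lambda_1 = -\partial_n u(-1)$ and $\lambda_2 = -\partial_n u(1)$.

3.4 Discretization of a quadratic functional with boundary conditions

Let's now discretize the same functional using the Vandermonde basis of order $N - 1$. The $i$-th basis function is $\varphi_i(x) = x^i$. Plugging $u = \sum_j u_j \varphi_j$ and $v = \varphi_i$ into $d_{(v,\mu)} L$ from example 3.3 gives

\[ \sum_{j=0}^{N-1} u_j \int_{-1}^{1} \varphi_i'(x)\, \varphi_j'(x)\, dx + \int_{-1}^{1} x^i\, dx + \lambda_1 (-1)^i + \lambda_2 = 0 \quad \forall i \in \{0, \dots, N-1\}, \]

\[ \sum_{j=0}^{N-1} u_j (-1)^j = 1, \qquad \sum_{j=0}^{N-1} u_j = 0. \]

Define $A_{1j} = (-1)^j$, $A_{2j} = 1$, $K_{ij} = \int_{-1}^{1} \varphi_i'(x)\, \varphi_j'(x)\, dx$, $f_i = -\int_{-1}^{1} x^i\, dx$, and $b = (1, 0)^T$. The discrete equations are then

\[ \begin{pmatrix} K & A^T \\ A & 0 \end{pmatrix} \begin{pmatrix} u \\ \lambda \end{pmatrix} = \begin{pmatrix} f \\ b \end{pmatrix}. \]

Some advantages to this approach are that we can work with a basis for $H^1$ (instead of $H^1_0$) and we can solve a problem with inhomogeneous boundary conditions.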
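As a rough numerical check of this discretization, the sketch below assembles the block system for the monomial basis $x^i$ on $(-1, 1)$ and solves it with three basis functions. It assumes that $K_{ij}$ is built from derivatives of the basis functions, $\int_{-1}^{1} (x^i)'(x^j)'\,dx$, and that $f_i = -\int_{-1}^{1} x^i\,dx$, so that the system reads $Ku + A^T\lambda = f$, $Au = b$. The function name `monomial_kkt` and its argument are illustrative, not part of the notes.

```python
import numpy as np

def monomial_kkt(n_basis):
    """Assemble the KKT matrix and right-hand side for the basis x**i,
    i = 0..n_basis-1, on the interval (-1, 1)."""
    idx = np.arange(n_basis)

    # K_ij = integral of (i x^(i-1)) (j x^(j-1)) over (-1, 1); zero if i or j is 0.
    K = np.zeros((n_basis, n_basis))
    for i in range(1, n_basis):
        for j in range(1, n_basis):
            p = i + j - 1                      # exponent after integrating
            K[i, j] = i * j * (1.0 - (-1.0) ** p) / p

    # f_i = -(integral of x^i over (-1, 1))
    f = -(1.0 - (-1.0) ** (idx + 1)) / (idx + 1)

    # Row 1 enforces u(-1) = 1, row 2 enforces u(1) = 0.
    A = np.vstack([(-1.0) ** idx, np.ones(n_basis)])
    b = np.array([1.0, 0.0])

    kkt = np.block([[K, A.T],
                    [A, np.zeros((2, 2))]])
    rhs = np.concatenate([f, b])
    return kkt, rhs

kkt, rhs = monomial_kkt(3)
sol = np.linalg.solve(kkt, rhs)
u, lam = sol[:3], sol[3:]
print("coefficients of 1, x, x^2:", u)
print("multipliers:", lam)
```

With three basis functions the exact solution of $u_{xx} = 1$, $u(-1) = 1$, $u(1) = 0$, namely $u = \frac{1}{2}(x^2 - x)$, lies in the discrete space, so the computed coefficients should be $(0, -\frac{1}{2}, \frac{1}{2})$ and the multipliers $(-\frac{3}{2}, -\frac{1}{2})$, matching $\lambda_1 = -\partial_n u(-1)$ and $\lambda_2 = -\partial_n u(1)$. Note that $K$ alone is singular (constants have zero derivative); the constraint rows are what make the block matrix invertible.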


Convex Optimization & Lagrange Duality

Convex Optimization & Lagrange Duality Convex Optimization & Lagrange Duality Chee Wei Tan CS 8292 : Advanced Topics in Convex Optimization and its Applications Fall 2010 Outline Convex optimization Optimality condition Lagrange duality KKT

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu

More information

Lagrange Multipliers

Lagrange Multipliers Optimization with Constraints As long as algebra and geometry have been separated, their progress have been slow and their uses limited; but when these two sciences have been united, they have lent each

More information

CS-E4830 Kernel Methods in Machine Learning

CS-E4830 Kernel Methods in Machine Learning CS-E4830 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 27. September, 2017 Juho Rousu 27. September, 2017 1 / 45 Convex optimization Convex optimisation This

More information

1 Introduction

1 Introduction 2018-06-12 1 Introduction The title of this course is Numerical Methods for Data Science. What does that mean? Before we dive into the course technical material, let s put things into context. I will not

More information

ECE580 Solution to Problem Set 6

ECE580 Solution to Problem Set 6 ECE580 Fall 2015 Solution to Problem Set 6 December 23 2015 1 ECE580 Solution to Problem Set 6 These problems are from the textbook by Chong and Zak 4th edition which is the textbook for the ECE580 Fall

More information

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Support vector machines In a nutshell Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Solution only depends on a small subset of training

More information

Exam in TMA4180 Optimization Theory

Exam in TMA4180 Optimization Theory Norwegian University of Science and Technology Department of Mathematical Sciences Page 1 of 11 Contact during exam: Anne Kværnø: 966384 Exam in TMA418 Optimization Theory Wednesday May 9, 13 Tid: 9. 13.

More information

ON LICQ AND THE UNIQUENESS OF LAGRANGE MULTIPLIERS

ON LICQ AND THE UNIQUENESS OF LAGRANGE MULTIPLIERS ON LICQ AND THE UNIQUENESS OF LAGRANGE MULTIPLIERS GERD WACHSMUTH Abstract. Kyparisis proved in 1985 that a strict version of the Mangasarian- Fromovitz constraint qualification (MFCQ) is equivalent to

More information

Inequality Constraints

Inequality Constraints Chapter 2 Inequality Constraints 2.1 Optimality Conditions Early in multivariate calculus we learn the significance of differentiability in finding minimizers. In this section we begin our study of the

More information

UC Berkeley Department of Electrical Engineering and Computer Science. EECS 227A Nonlinear and Convex Optimization. Solutions 5 Fall 2009

UC Berkeley Department of Electrical Engineering and Computer Science. EECS 227A Nonlinear and Convex Optimization. Solutions 5 Fall 2009 UC Berkeley Department of Electrical Engineering and Computer Science EECS 227A Nonlinear and Convex Optimization Solutions 5 Fall 2009 Reading: Boyd and Vandenberghe, Chapter 5 Solution 5.1 Note that

More information

5. Duality. Lagrangian

5. Duality. Lagrangian 5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

More information

Mathematical Foundations -1- Constrained Optimization. Constrained Optimization. An intuitive approach 2. First Order Conditions (FOC) 7

Mathematical Foundations -1- Constrained Optimization. Constrained Optimization. An intuitive approach 2. First Order Conditions (FOC) 7 Mathematical Foundations -- Constrained Optimization Constrained Optimization An intuitive approach First Order Conditions (FOC) 7 Constraint qualifications 9 Formal statement of the FOC for a maximum

More information

Optimality, Duality, Complementarity for Constrained Optimization

Optimality, Duality, Complementarity for Constrained Optimization Optimality, Duality, Complementarity for Constrained Optimization Stephen Wright University of Wisconsin-Madison May 2014 Wright (UW-Madison) Optimality, Duality, Complementarity May 2014 1 / 41 Linear

More information

Math 155 Prerequisite Review Handout

Math 155 Prerequisite Review Handout Math 155 Prerequisite Review Handout August 23, 2010 Contents 1 Basic Mathematical Operations 2 1.1 Examples...................................... 2 1.2 Exercises.......................................

More information

subject to (x 2)(x 4) u,

subject to (x 2)(x 4) u, Exercises Basic definitions 5.1 A simple example. Consider the optimization problem with variable x R. minimize x 2 + 1 subject to (x 2)(x 4) 0, (a) Analysis of primal problem. Give the feasible set, the

More information

Chap 2. Optimality conditions

Chap 2. Optimality conditions Chap 2. Optimality conditions Version: 29-09-2012 2.1 Optimality conditions in unconstrained optimization Recall the definitions of global, local minimizer. Geometry of minimization Consider for f C 1

More information

I.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec - Spring 2010

I.3. LMI DUALITY. Didier HENRION EECI Graduate School on Control Supélec - Spring 2010 I.3. LMI DUALITY Didier HENRION henrion@laas.fr EECI Graduate School on Control Supélec - Spring 2010 Primal and dual For primal problem p = inf x g 0 (x) s.t. g i (x) 0 define Lagrangian L(x, z) = g 0

More information

Convex Functions and Optimization

Convex Functions and Optimization Chapter 5 Convex Functions and Optimization 5.1 Convex Functions Our next topic is that of convex functions. Again, we will concentrate on the context of a map f : R n R although the situation can be generalized

More information

Math 10C - Fall Final Exam

Math 10C - Fall Final Exam Math 1C - Fall 217 - Final Exam Problem 1. Consider the function f(x, y) = 1 x 2 (y 1) 2. (i) Draw the level curve through the point P (1, 2). Find the gradient of f at the point P and draw the gradient

More information

Convex Optimization M2

Convex Optimization M2 Convex Optimization M2 Lecture 3 A. d Aspremont. Convex Optimization M2. 1/49 Duality A. d Aspremont. Convex Optimization M2. 2/49 DMs DM par email: dm.daspremont@gmail.com A. d Aspremont. Convex Optimization

More information

Support Vector Machine (SVM) and Kernel Methods

Support Vector Machine (SVM) and Kernel Methods Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2014 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin

More information

WHY DUALITY? Gradient descent Newton s method Quasi-newton Conjugate gradients. No constraints. Non-differentiable ???? Constrained problems? ????

WHY DUALITY? Gradient descent Newton s method Quasi-newton Conjugate gradients. No constraints. Non-differentiable ???? Constrained problems? ???? DUALITY WHY DUALITY? No constraints f(x) Non-differentiable f(x) Gradient descent Newton s method Quasi-newton Conjugate gradients etc???? Constrained problems? f(x) subject to g(x) apple 0???? h(x) =0

More information

Machine Learning Support Vector Machines. Prof. Matteo Matteucci

Machine Learning Support Vector Machines. Prof. Matteo Matteucci Machine Learning Support Vector Machines Prof. Matteo Matteucci Discriminative vs. Generative Approaches 2 o Generative approach: we derived the classifier from some generative hypothesis about the way

More information

Lecture: Duality of LP, SOCP and SDP

Lecture: Duality of LP, SOCP and SDP 1/33 Lecture: Duality of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2017.html wenzw@pku.edu.cn Acknowledgement:

More information

MATH2070 Optimisation

MATH2070 Optimisation MATH2070 Optimisation Nonlinear optimisation with constraints Semester 2, 2012 Lecturer: I.W. Guo Lecture slides courtesy of J.R. Wishart Review The full nonlinear optimisation problem with equality constraints

More information

Numerical Optimization of Partial Differential Equations

Numerical Optimization of Partial Differential Equations Numerical Optimization of Partial Differential Equations Part I: basic optimization concepts in R n Bartosz Protas Department of Mathematics & Statistics McMaster University, Hamilton, Ontario, Canada

More information

Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012

Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012 Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Linear classifier Which classifier? x 2 x 1 2 Linear classifier Margin concept x 2

More information