Convexity II: Optimization Basics


1 Convexity II: Optimization Basics
Lecturer: Ryan Tibshirani
Convex Optimization 10-725/36-725
See the supplements for reviews of basic multivariate calculus and basic linear algebra.

2 Last time: convex sets and functions
Convex calculus makes it easy to check convexity. Tools:
- Definitions of convex sets and functions, classic examples
- Key properties (e.g., first- and second-order characterizations for functions)
- Operations that preserve convexity (e.g., affine composition)

[Figure: the chord from (x, f(x)) to (y, f(y)) lying above the graph of a convex function]

E.g., is max{ log(1/(a^T x + b)^7), ‖Ax + b‖_1^5 } convex?

3 Outline
Today:
- Optimization terminology
- Properties and first-order optimality
- Equivalent transformations

4 Optimization terminology
Reminder: a convex optimization problem (or program) is

min_{x ∈ D} f(x)
subject to g_i(x) ≤ 0, i = 1, ..., m
           Ax = b

where f and g_i, i = 1, ..., m, are all convex, and the optimization domain is D = dom(f) ∩ ⋂_{i=1}^m dom(g_i) (often we do not write D)
- f is called the criterion or objective function
- g_i is called an inequality constraint function
- If x ∈ D, g_i(x) ≤ 0 for i = 1, ..., m, and Ax = b, then x is called a feasible point
- The minimum of f(x) over all feasible points is called the optimal value, written f*
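
To make the standard form concrete, here is a minimal sketch in cvxpy (the modeling library, the data, and the specific f and g_i below are illustrative additions, not part of the lecture):

```python
# A made-up convex program in standard form, posed and solved with cvxpy.
import cvxpy as cp
import numpy as np

np.random.seed(0)
n = 5
A = np.random.randn(2, n)
b = A @ np.random.randn(n)                 # chosen so that Ax = b is feasible

x = cp.Variable(n)
f = cp.sum_squares(x - 1)                  # a convex criterion f(x)
g = [cp.norm(x, 1) - 3, cp.max(x) - 2]     # convex g_i(x); constraints g_i(x) <= 0

prob = cp.Problem(cp.Minimize(f), [gi <= 0 for gi in g] + [A @ x == b])
prob.solve()
print(prob.value)   # the optimal value f*
print(x.value)      # a feasible point attaining it, i.e., a solution
```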

5 If x is feasible and f(x) = f*, then x is called optimal; also called a solution, or a minimizer¹
If x is feasible and f(x) ≤ f* + ɛ, then x is called ɛ-suboptimal
If x is feasible and g_i(x) = 0, then we say g_i is active at x

Convex minimization

min_x f(x) subject to g_i(x) ≤ 0, i = 1, ..., m; Ax = b

can be reposed as concave maximization

max_x −f(x) subject to g_i(x) ≤ 0, i = 1, ..., m; Ax = b

Both are called convex optimization problems

¹ Note: a convex optimization problem need not have solutions, i.e., need not attain its minimum, but we will not be careful about this

6 Convex solution sets
Let X_opt be the set of all solutions of a convex problem, written

X_opt = argmin_x f(x) subject to g_i(x) ≤ 0, i = 1, ..., m; Ax = b

Key property: X_opt is a convex set

Proof: use the definitions. If x, y are solutions, then for 0 ≤ t ≤ 1:
- tx + (1 − t)y ∈ D
- g_i(tx + (1 − t)y) ≤ t g_i(x) + (1 − t) g_i(y) ≤ 0
- A(tx + (1 − t)y) = tAx + (1 − t)Ay = b
- f(tx + (1 − t)y) ≤ t f(x) + (1 − t) f(y) = f*

Therefore tx + (1 − t)y is also a solution

Another key property: if f is strictly convex, then the solution is unique, i.e., X_opt contains one element

7 Example: lasso
Given y ∈ R^n and X ∈ R^{n×p}, consider the lasso problem:

min_β ‖y − Xβ‖_2^2 subject to ‖β‖_1 ≤ s

Is this convex? What is the criterion function? The inequality and equality constraints? The feasible set? Is the solution unique, when:
- n ≥ p and X has full column rank?
- p > n (the "high-dimensional" case)?

How do our answers change if we change the criterion to the Huber loss:

∑_{i=1}^n ρ(y_i − x_i^T β), where ρ(z) = z^2/2 if |z| ≤ δ, and δ(|z| − δ/2) otherwise?
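
As a hedged sketch (cvxpy and the synthetic data are additions for illustration, not the lecture's), the constrained-form lasso and a Huber variant can be posed directly; note that cvxpy's huber(z, M) equals z^2 for |z| ≤ M and 2M|z| − M^2 otherwise, i.e., twice the ρ above with δ = M:

```python
# Constrained-form lasso on synthetic data, plus a Huber-loss variant.
import cvxpy as cp
import numpy as np

np.random.seed(0)
n, p, s = 50, 20, 3.0
X = np.random.randn(n, p)
y = X[:, :3] @ np.array([2.0, -1.0, 0.5]) + 0.1 * np.random.randn(n)

beta = cp.Variable(p)
constraint = [cp.norm(beta, 1) <= s]

lasso = cp.Problem(cp.Minimize(cp.sum_squares(y - X @ beta)), constraint)
lasso.solve()
print(np.round(beta.value, 3))       # typically sparse

huber = cp.Problem(cp.Minimize(cp.sum(cp.huber(y - X @ beta, M=1.0))), constraint)
huber.solve()                        # still a convex problem: Huber loss is convex
```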

8 Example: support vector machines
Given y ∈ {−1, 1}^n and X ∈ R^{n×p} with rows x_1, ..., x_n, consider the support vector machine or SVM problem:

min_{β, β_0, ξ} (1/2)‖β‖_2^2 + C ∑_{i=1}^n ξ_i
subject to ξ_i ≥ 0, i = 1, ..., n
           y_i(x_i^T β + β_0) ≥ 1 − ξ_i, i = 1, ..., n

Is this convex? What are the criterion, constraints, and feasible set? Is the solution (β, β_0, ξ) unique? What if we changed the criterion to

(1/2)‖β‖_2^2 + (1/2)β_0^2 + C ∑_{i=1}^n ξ_i^{1.01} ?

For the original criterion, what about just the β component, at the solution?
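
A sketch of this primal SVM in cvxpy (synthetic data; the value of C is an arbitrary choice, and none of this code is from the slides):

```python
# Primal SVM: 0.5 ||beta||^2 + C * sum(xi), with margin and slack constraints.
import cvxpy as cp
import numpy as np

np.random.seed(1)
n, p, C = 40, 2, 1.0
X = np.random.randn(n, p)
y = np.sign(X @ np.array([1.0, -1.0]) + 0.1 * np.random.randn(n))

beta, beta0, xi = cp.Variable(p), cp.Variable(), cp.Variable(n)
margins = cp.multiply(y, X @ beta + beta0)      # y_i (x_i^T beta + beta_0)
prob = cp.Problem(
    cp.Minimize(0.5 * cp.sum_squares(beta) + C * cp.sum(xi)),
    [xi >= 0, margins >= 1 - xi])
prob.solve()
print(beta.value, beta0.value)
```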

9 Local minima are global minima
For a convex problem, a feasible point x is called locally optimal if there is some R > 0 such that

f(x) ≤ f(y) for all feasible y with ‖x − y‖_2 ≤ R

Reminder: for convex optimization problems, local optima are global optima. The proof simply follows from the definitions.

[Figure: a convex function, whose only local minimum is global, versus a nonconvex function with a non-global local minimum]

10 Rewriting constraints
The optimization problem

min_x f(x)
subject to g_i(x) ≤ 0, i = 1, ..., m
           Ax = b

can be rewritten as

min_x f(x) subject to x ∈ C

where C = {x : g_i(x) ≤ 0, i = 1, ..., m, Ax = b} is the feasible set. Hence the above formulation is completely general

With I_C the indicator function of C, we can write this in unconstrained form as

min_x f(x) + I_C(x)

11 First-order optimality condition
For a convex problem

min_x f(x) subject to x ∈ C

with differentiable f, a feasible point x is optimal if and only if

∇f(x)^T (y − x) ≥ 0 for all y ∈ C

This is called the first-order condition for optimality

In words: all feasible directions from x are aligned with the gradient ∇f(x)

Important special case: if C = R^n (unconstrained optimization), then the optimality condition reduces to the familiar ∇f(x) = 0

12 Example: quadratic minimization
Consider minimizing the quadratic function

f(x) = (1/2) x^T Q x + b^T x + c

where Q ⪰ 0. The first-order condition says that the solution satisfies

∇f(x) = Qx + b = 0

Cases:
- if Q ≻ 0, then there is a unique solution x = −Q^{-1} b
- if Q is singular and b ∉ col(Q), then there is no solution (i.e., min_x f(x) = −∞)
- if Q is singular and b ∈ col(Q), then there are infinitely many solutions,

x = −Q^+ b + z, z ∈ null(Q)

where Q^+ is the pseudoinverse of Q
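
A small numpy check of these cases (Q and b below are an illustrative example where Q is singular but b ∈ col(Q)):

```python
import numpy as np

Q = np.array([[2.0, 0.0],
              [0.0, 0.0]])       # positive semidefinite but singular
b = np.array([-2.0, 0.0])        # lies in col(Q), so solutions exist

# If Q were positive definite: x = -np.linalg.solve(Q, b) would be unique.
x = -np.linalg.pinv(Q) @ b       # one solution, via the pseudoinverse
print(Q @ x + b)                 # gradient Qx + b = 0, confirming optimality
# Every x + z with z in null(Q) (here, any multiple of e_2) is also a solution.
```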

13 Example: equality-constrained minimization
Consider the equality-constrained convex problem:

min_x f(x) subject to Ax = b

with f differentiable. Let's prove the Lagrange multiplier optimality condition

∇f(x) + A^T u = 0 for some u

According to first-order optimality, the solution x satisfies Ax = b and

∇f(x)^T (y − x) ≥ 0 for all y such that Ay = b

This is equivalent to

∇f(x)^T v = 0 for all v ∈ null(A)

The result follows because null(A)^⊥ = row(A)
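
As a sketch of this condition in action, take f(x) = (1/2)‖x‖_2^2 (an example chosen here, not from the slides), so that ∇f(x) = x; stacking the Lagrange condition with Ax = b gives a linear system we can solve for x and u:

```python
# Solve min 0.5 ||x||^2 s.t. Ax = b via the system [I A^T; A 0][x; u] = [0; b].
import numpy as np

np.random.seed(2)
m, n = 2, 4
A = np.random.randn(m, n)        # full row rank (holds a.s. for random A)
b = np.random.randn(m)

K = np.block([[np.eye(n), A.T],
              [A, np.zeros((m, m))]])
sol = np.linalg.solve(K, np.concatenate([np.zeros(n), b]))
x, u = sol[:n], sol[n:]
print(np.allclose(x + A.T @ u, 0))   # Lagrange condition: grad f(x) + A^T u = 0
print(np.allclose(A @ x, b))         # feasibility
```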

14 Example: projection onto a convex set
Consider projection onto a convex set C:

min_x ‖a − x‖_2^2 subject to x ∈ C

The first-order optimality condition says that the solution x satisfies

∇f(x)^T (y − x) = 2(x − a)^T (y − x) ≥ 0 for all y ∈ C

Equivalently, this says that

a − x ∈ N_C(x)

where recall N_C(x) is the normal cone to C at x
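
For intuition, here is a tiny numerical check (an illustration assuming C = [0, 1]^n, where the projection is just a coordinatewise clip), verifying the first-order condition at random feasible points:

```python
import numpy as np

np.random.seed(3)
n = 5
a = np.random.randn(n)
x = np.clip(a, 0.0, 1.0)             # projection of a onto the box [0,1]^n

Y = np.random.rand(200, n)           # random points y in C
# first-order condition: (x - a)^T (y - x) >= 0 for every y in C
print(np.all((Y - x) @ (x - a) >= -1e-12))   # True
```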

15 Partial optimization
Reminder: g(x) = min_{y ∈ C} f(x, y) is convex in x, provided that f is convex in (x, y) and C is a convex set

Therefore we can always partially optimize a convex problem and retain convexity

E.g., if we decompose x = (x_1, x_2) ∈ R^{n_1 + n_2}, then

min_{x_1, x_2} f(x_1, x_2)
subject to g_1(x_1) ≤ 0
           g_2(x_2) ≤ 0

is equivalent to

min_{x_1} f̃(x_1) subject to g_1(x_1) ≤ 0

where f̃(x_1) = min_{x_2} { f(x_1, x_2) : g_2(x_2) ≤ 0 }. The second problem is convex if the first one is

16 Example: hinge form of SVMs
Recall the SVM problem

min_{β, β_0, ξ} (1/2)‖β‖_2^2 + C ∑_{i=1}^n ξ_i
subject to ξ_i ≥ 0, y_i(x_i^T β + β_0) ≥ 1 − ξ_i, i = 1, ..., n

Rewrite the constraints as ξ_i ≥ max{0, 1 − y_i(x_i^T β + β_0)}. Indeed we can argue that we have equality at the solution

Therefore plugging in for the optimal ξ gives the hinge form of SVMs:

min_{β, β_0} (1/2)‖β‖_2^2 + C ∑_{i=1}^n [1 − y_i(x_i^T β + β_0)]_+

where a_+ = max{0, a} is called the hinge function
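
A sketch of the hinge form in cvxpy (same synthetic setup as the earlier SVM sketch; cp.pos implements the hinge a_+ = max{0, a}):

```python
# Unconstrained hinge-loss SVM: the slack variables have been minimized out.
import cvxpy as cp
import numpy as np

np.random.seed(1)
n, p, C = 40, 2, 1.0
X = np.random.randn(n, p)
y = np.sign(X @ np.array([1.0, -1.0]) + 0.1 * np.random.randn(n))

beta, beta0 = cp.Variable(p), cp.Variable()
hinge = cp.pos(1 - cp.multiply(y, X @ beta + beta0))  # [1 - y_i(x_i^T beta + beta_0)]_+
prob = cp.Problem(cp.Minimize(0.5 * cp.sum_squares(beta) + C * cp.sum(hinge)))
prob.solve()   # matches the constrained form's solution in (beta, beta0)
```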

17 Transformations and change of variables
If h : R → R is a monotone increasing transformation, then

min_x f(x) subject to x ∈ C  ⟺  min_x h(f(x)) subject to x ∈ C

Similarly, inequality or equality constraints can be transformed to yield equivalent optimization problems. We can use this to reveal the hidden convexity of a problem

If φ : R^n → R^m is one-to-one, and its image covers the feasible set C, then we can change variables in an optimization problem:

min_x f(x) subject to x ∈ C  ⟺  min_y f(φ(y)) subject to φ(y) ∈ C

18 Example: geometric programming
A monomial is a function f : R^n_{++} → R of the form

f(x) = γ x_1^{a_1} x_2^{a_2} ⋯ x_n^{a_n}

for γ > 0 and a_1, ..., a_n ∈ R. A posynomial is a sum of monomials,

f(x) = ∑_{k=1}^p γ_k x_1^{a_{k1}} x_2^{a_{k2}} ⋯ x_n^{a_{kn}}

A geometric program is of the form

min_x f(x)
subject to g_i(x) ≤ 1, i = 1, ..., m
           h_j(x) = 1, j = 1, ..., r

where f and g_i, i = 1, ..., m, are posynomials and h_j, j = 1, ..., r, are monomials. This is nonconvex

19 Let's prove that a geometric program is equivalent to a convex one. Given f(x) = γ x_1^{a_1} x_2^{a_2} ⋯ x_n^{a_n}, let y_i = log x_i and rewrite this as

γ (e^{y_1})^{a_1} (e^{y_2})^{a_2} ⋯ (e^{y_n})^{a_n} = e^{a^T y + b}

for b = log γ. Likewise, a posynomial can be written as ∑_{k=1}^p e^{a_k^T y + b_k}. With this variable substitution, and after taking logs, a geometric program is equivalent to

min_y log( ∑_{k=1}^{p_0} e^{a_{0k}^T y + b_{0k}} )
subject to log( ∑_{k=1}^{p_i} e^{a_{ik}^T y + b_{ik}} ) ≤ 0, i = 1, ..., m
           c_j^T y + d_j = 0, j = 1, ..., r

This is convex, recalling the convexity of soft max (log-sum-exp) functions
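
For what it's worth, cvxpy automates exactly this transformation: passing gp=True solves a geometric program by the log change of variables above (the tiny GP below is a made-up illustration, not from the slides):

```python
import cvxpy as cp

x = cp.Variable(pos=True)
y = cp.Variable(pos=True)

# minimize the posynomial x + y; x*y >= 4 rewrites as 4 x^-1 y^-1 <= 1
prob = cp.Problem(cp.Minimize(x + y), [x * y >= 4])
prob.solve(gp=True)                     # log transform + log-sum-exp internally
print(prob.value, x.value, y.value)     # ~4.0, attained at x = y = 2
```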

20 Many interesting problems are geometric programs, e.g., floor planning:

[Figure: rectangles of width w_i and height h_i at positions (x_i, y_i), with areas C_i, packed inside a bounding box of width W and height H]

See Boyd et al. (2007), "A tutorial on geometric programming", and also Chapter 8.8 of the B & V book

21 Example floor planning program:

min_{W, H, x, y, w, h} WH
subject to 0 ≤ x_i ≤ W, i = 1, ..., n
           0 ≤ y_i ≤ H, i = 1, ..., n
           x_i + w_i ≤ x_j, (i, j) ∈ L
           y_i + h_i ≤ y_j, (i, j) ∈ B
           w_i h_i = C_i, i = 1, ..., n

Check: why is this a geometric program?

22 Eliminating equality constraints
An important special case of change of variables: eliminating equality constraints. Given the problem

min_x f(x)
subject to g_i(x) ≤ 0, i = 1, ..., m
           Ax = b

we can always express any feasible point as x = My + x_0, where Ax_0 = b and col(M) = null(A). Hence the above is equivalent to

min_y f(My + x_0) subject to g_i(My + x_0) ≤ 0, i = 1, ..., m

Note: this is fully general but not always a good idea (practically)
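
A sketch of building this parametrization numerically (using scipy's null_space and a least-squares particular solution are implementation choices here, with illustrative A and b):

```python
import numpy as np
from scipy.linalg import null_space

np.random.seed(4)
m, n = 2, 5
A = np.random.randn(m, n)
b = np.random.randn(m)

M = null_space(A)                           # columns of M span null(A)
x0 = np.linalg.lstsq(A, b, rcond=None)[0]   # a particular solution, A x0 = b

y = np.random.randn(M.shape[1])             # any y yields a feasible x
x = M @ y + x0
print(np.allclose(A @ x, b))                # True
```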

23 Introducing slack variables
Essentially the opposite of eliminating equality constraints: introducing slack variables. Given the problem

min_x f(x)
subject to g_i(x) ≤ 0, i = 1, ..., m
           Ax = b

we can transform the inequality constraints via

min_{x, s} f(x)
subject to s_i ≥ 0, i = 1, ..., m
           g_i(x) + s_i = 0, i = 1, ..., m
           Ax = b

Note: this is no longer convex unless g_i, i = 1, ..., m, are affine

24 Relaxing nonaffine equality constraints
Given an optimization problem

min_x f(x) subject to x ∈ C

we can always take an enlarged constraint set C̃ ⊇ C and consider

min_x f(x) subject to x ∈ C̃

This is called a relaxation, and its optimal value is always smaller than or equal to that of the original problem

Important special case: relaxing nonaffine equality constraints, i.e.,

h_j(x) = 0, j = 1, ..., r

where the h_j, j = 1, ..., r, are convex but nonaffine, are replaced with

h_j(x) ≤ 0, j = 1, ..., r

25 Example: maximum utility problem
The maximum utility problem models investment/consumption:

max_{x, b} ∑_{t=1}^T α_t u(x_t)
subject to b_{t+1} = b_t + f(b_t) − x_t, t = 1, ..., T
           0 ≤ x_t ≤ b_t, t = 1, ..., T

Here b_t is the budget and x_t is the amount consumed at time t; f is an investment return function and u is a utility function, both concave and increasing

Is this a convex problem? What if we replace the equality constraints with inequalities:

b_{t+1} ≤ b_t + f(b_t) − x_t, t = 1, ..., T ?

26 Example: principal components analysis
Given X ∈ R^{n×p}, consider the low rank approximation problem:

min_R ‖X − R‖_F^2 subject to rank(R) = k

Here ‖A‖_F^2 = ∑_{i=1}^n ∑_{j=1}^p A_{ij}^2, the entrywise squared ℓ_2 norm, and rank(A) denotes the rank of A

This is also called the principal components analysis or PCA problem. Given X = UDV^T, the singular value decomposition or SVD, the solution is

R = U_k D_k V_k^T

where U_k, V_k are the first k columns of U, V, and D_k contains the first k diagonal elements of D. I.e., R is the reconstruction of X from its first k principal components

This problem is not convex. Why?
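
A quick numpy sketch of this SVD solution on synthetic X, checking that the optimal value equals the sum of the squared trailing singular values:

```python
import numpy as np

np.random.seed(5)
n, p, k = 30, 10, 3
X = np.random.randn(n, p)

U, d, Vt = np.linalg.svd(X, full_matrices=False)
R = U[:, :k] @ np.diag(d[:k]) @ Vt[:k, :]     # R = U_k D_k V_k^T
err = np.linalg.norm(X - R, 'fro')**2
print(np.allclose(err, np.sum(d[k:]**2)))     # True: the optimal value
```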

27 We can recast the PCA problem in a convex form. First rewrite it as

min_{Z ∈ S^p} ‖X − XZ‖_F^2 subject to rank(Z) = k, Z is a projection

⟺ max_{Z ∈ S^p} tr(SZ) subject to rank(Z) = k, Z is a projection

where S = X^T X. Hence the constraint set is the nonconvex set

C = {Z ∈ S^p : λ_i(Z) ∈ {0, 1}, i = 1, ..., p, tr(Z) = k}

where λ_i(Z), i = 1, ..., p, are the eigenvalues of Z. The solution in this formulation is

Z = V_k V_k^T

where V_k gives the first k columns of V

28 Now consider relaxing the constraint set to F_k = conv(C), its convex hull. Note

F_k = {Z ∈ S^p : λ_i(Z) ∈ [0, 1], i = 1, ..., p, tr(Z) = k}
    = {Z ∈ S^p : 0 ⪯ Z ⪯ I, tr(Z) = k}

Recall this is called the Fantope of order k

Hence the linear maximization over the Fantope, namely

max_{Z ∈ F_k} tr(SZ)

is convex. Remarkably, this is equivalent to the nonconvex PCA problem (it admits the same solution)!

(Famous result: Fan (1949), "On a theorem of Weyl concerning eigenvalues of linear transformations")
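
As a sanity check of Fan's result, here is a hedged cvxpy sketch of the Fantope program; it is a semidefinite program, so an SDP-capable solver (e.g., the SCS solver bundled with cvxpy) is assumed:

```python
import cvxpy as cp
import numpy as np

np.random.seed(6)
n, p, k = 50, 6, 2
X = np.random.randn(n, p)
S = X.T @ X

Z = cp.Variable((p, p), symmetric=True)
prob = cp.Problem(cp.Maximize(cp.trace(S @ Z)),
                  [Z >> 0, np.eye(p) - Z >> 0, cp.trace(Z) == k])
prob.solve()

top_k = np.sort(np.linalg.eigvalsh(S))[-k:].sum()
print(prob.value, top_k)   # both are (approximately) the sum of the top k eigenvalues
```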

29 References and further reading
- S. Boyd and L. Vandenberghe (2004), Convex Optimization, Chapter 4
- O. Guler (2010), Foundations of Optimization, Chapter 4
