Convexity II: Optimization Basics
Lecturer: Ryan Tibshirani
Convex Optimization

See supplements for reviews of
- basic multivariate calculus
- basic linear algebra
Last time: convex sets and functions

"Convex calculus" makes it easy to check convexity. Tools:
- Definitions of convex sets and functions, classic examples
- Key properties (e.g., first- and second-order characterizations for functions)
- Operations that preserve convexity (e.g., affine composition)

[Figure: a convex function with the points $(x, f(x))$ and $(y, f(y))$ marked]

E.g., is $\max\left\{ \log\left( \dfrac{1}{(a^T x + b)^7} \right), \; \|Ax + b\|_1^5 \right\}$ convex?
Outline

Today:
- Optimization terminology
- Properties and first-order optimality
- Equivalent transformations
Optimization terminology

Reminder: a convex optimization problem (or program) is

$$\min_{x \in D} \; f(x) \quad \text{subject to} \quad g_i(x) \le 0, \; i = 1, \ldots, m, \quad Ax = b$$

where $f$ and $g_i$, $i = 1, \ldots, m$ are all convex, and the optimization domain is $D = \mathrm{dom}(f) \cap \bigcap_{i=1}^m \mathrm{dom}(g_i)$ (often we do not write $D$).

- $f$ is called the criterion or objective function
- $g_i$ is called an inequality constraint function
- If $x \in D$, $g_i(x) \le 0$, $i = 1, \ldots, m$, and $Ax = b$, then $x$ is called a feasible point
- The minimum of $f(x)$ over all feasible points $x$ is called the optimal value, written $f^\star$
- If $x$ is feasible and $f(x) = f^\star$, then $x$ is called optimal; also called a solution, or a minimizer (see note below)
- If $x$ is feasible and $f(x) \le f^\star + \epsilon$, then $x$ is called $\epsilon$-suboptimal
- If $x$ is feasible and $g_i(x) = 0$, then we say $g_i$ is active at $x$
- Convex minimization can be reposed as concave maximization:

$$\min_x \; f(x) \;\; \text{subject to} \;\; g_i(x) \le 0, \; i = 1, \ldots, m, \; Ax = b \quad \Longleftrightarrow \quad \max_x \; -f(x) \;\; \text{subject to} \;\; g_i(x) \le 0, \; i = 1, \ldots, m, \; Ax = b$$

Both are called convex optimization problems.

Note: a convex optimization problem need not have solutions, i.e., need not attain its minimum, but we will not be careful about this.
Convex solution sets

Let $X_{\mathrm{opt}}$ be the set of all solutions of a convex problem, written

$$X_{\mathrm{opt}} = \arg\min_x \; f(x) \quad \text{subject to} \quad g_i(x) \le 0, \; i = 1, \ldots, m, \quad Ax = b$$

Key property: $X_{\mathrm{opt}}$ is a convex set.

Proof: use definitions. If $x, y$ are solutions, then for $0 \le t \le 1$,
- $tx + (1-t)y \in D$
- $g_i(tx + (1-t)y) \le t g_i(x) + (1-t) g_i(y) \le 0$
- $A(tx + (1-t)y) = tAx + (1-t)Ay = b$
- $f(tx + (1-t)y) \le t f(x) + (1-t) f(y) = f^\star$

Therefore $tx + (1-t)y$ is also a solution.

Another key property: if $f$ is strictly convex, then the solution is unique, i.e., $X_{\mathrm{opt}}$ contains (at most) one element.
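Example: lasso

Given $y \in \mathbb{R}^n$, $X \in \mathbb{R}^{n \times p}$, consider the lasso problem:

$$\min_\beta \; \|y - X\beta\|_2^2 \quad \text{subject to} \quad \|\beta\|_1 \le s$$

Is this convex? What is the criterion function? The inequality and equality constraints? Feasible set? Is the solution unique, when:
- $n \ge p$ and $X$ has full column rank?
- $p > n$ ("high-dimensional" case)?

How do our answers change if we changed the criterion to the Huber loss:

$$\sum_{i=1}^n \rho(y_i - x_i^T \beta), \qquad \rho(z) = \begin{cases} \frac{1}{2} z^2 & |z| \le \delta \\ \delta |z| - \frac{1}{2}\delta^2 & \text{else} \end{cases} \;?$$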
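To make the Huber criterion concrete, here is a minimal NumPy sketch of $\rho$ and of the summed criterion. The function names `huber` and `huber_criterion` and the default `delta=1.0` are illustrative choices, not part of the lecture.

```python
import numpy as np

def huber(z, delta=1.0):
    """Huber loss rho(z): quadratic for |z| <= delta, linear beyond that."""
    z = np.asarray(z, dtype=float)
    quadratic = 0.5 * z**2
    linear = delta * np.abs(z) - 0.5 * delta**2
    return np.where(np.abs(z) <= delta, quadratic, linear)

def huber_criterion(beta, X, y, delta=1.0):
    """Sum of Huber losses over the residuals y_i - x_i^T beta."""
    return float(np.sum(huber(y - X @ beta, delta)))

# Small residuals are penalized quadratically, large ones linearly.
values = huber([-3.0, -0.5, 0.0, 0.5, 3.0], delta=1.0)
```

The two branches agree at $|z| = \delta$, so $\rho$ is continuous; it is also convex, which is what keeps the modified problem convex.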
Example: support vector machines

Given $y \in \{-1, 1\}^n$, $X \in \mathbb{R}^{n \times p}$ with rows $x_1, \ldots, x_n$, consider the support vector machine or SVM problem:

$$\min_{\beta, \beta_0, \xi} \; \frac{1}{2}\|\beta\|_2^2 + C \sum_{i=1}^n \xi_i \quad \text{subject to} \quad \xi_i \ge 0, \;\; y_i(x_i^T \beta + \beta_0) \ge 1 - \xi_i, \;\; i = 1, \ldots, n$$

Is this convex? What is the criterion, constraints, feasible set? Is the solution $(\beta, \beta_0, \xi)$ unique? What if we changed the criterion to

$$\frac{1}{2}\|\beta\|_2^2 + \frac{1}{2}\beta_0^2 + C \sum_{i=1}^n \xi_i^{1.01} \;?$$

For the original criterion, what about the $\beta$ component, at the solution?
Local minima are global minima

For a convex problem, a feasible point $x$ is called locally optimal if there is some $R > 0$ such that

$$f(x) \le f(y) \quad \text{for all feasible } y \text{ such that } \|x - y\|_2 \le R$$

Reminder: for convex optimization problems, local optima are global optima.

Proof simply follows from definitions.

[Figure: two panels, "Convex" and "Nonconvex"]
Rewriting constraints

The optimization problem

$$\min_x \; f(x) \quad \text{subject to} \quad g_i(x) \le 0, \; i = 1, \ldots, m, \quad Ax = b$$

can be rewritten as

$$\min_x \; f(x) \quad \text{subject to} \quad x \in C$$

where $C = \{x : g_i(x) \le 0, \; i = 1, \ldots, m, \; Ax = b\}$, the feasible set. Hence the above formulation is completely general.

With $I_C$ the indicator of $C$, we can write this in unconstrained form

$$\min_x \; f(x) + I_C(x)$$
First-order optimality condition

For a convex problem

$$\min_x \; f(x) \quad \text{subject to} \quad x \in C$$

and differentiable $f$, a feasible point $x$ is optimal if and only if

$$\nabla f(x)^T (y - x) \ge 0 \quad \text{for all } y \in C$$

This is called the first-order condition for optimality.

In words: all feasible directions from $x$ are aligned with the gradient $\nabla f(x)$.

Important special case: if $C = \mathbb{R}^n$ (unconstrained optimization), then the optimality condition reduces to the familiar $\nabla f(x) = 0$.
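The first-order condition can be probed numerically on a toy problem. The sketch below assumes the illustrative objective $f(x) = (x_1 - 2)^2 + (x_2 + 1)^2$ over the box $C = [0, 1]^2$ (neither appears in the lecture); its minimizer over the box is $(1, 0)$. Sampling feasible points only gives evidence, not a proof, that the inequality holds for all $y \in C$.

```python
import numpy as np

def grad_f(x):
    # Gradient of the assumed objective f(x) = (x1 - 2)^2 + (x2 + 1)^2
    return 2.0 * (x - np.array([2.0, -1.0]))

def min_first_order_gap(x, samples):
    """Smallest value of grad f(x)^T (y - x) over the sampled feasible points y."""
    return float(np.min((samples - x) @ grad_f(x)))

rng = np.random.default_rng(0)
samples = rng.uniform(0.0, 1.0, size=(1000, 2))   # random points y in C = [0, 1]^2
corners = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
samples = np.vstack([samples, corners])           # include the corners of the box

gap_at_solution = min_first_order_gap(np.array([1.0, 0.0]), samples)  # nonnegative
gap_at_other = min_first_order_gap(np.array([0.5, 0.5]), samples)     # negative
```

At the feasible but non-optimal point $(0.5, 0.5)$ some feasible direction has negative inner product with the gradient, so moving that way decreases $f$.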
Example: quadratic minimization

Consider minimizing the quadratic function

$$f(x) = \frac{1}{2} x^T Q x + b^T x + c$$

where $Q \succeq 0$. The first-order condition says that the solution satisfies

$$\nabla f(x) = Qx + b = 0$$

Cases:
- if $Q \succ 0$, then there is a unique solution $x = -Q^{-1} b$
- if $Q$ is singular and $b \notin \mathrm{col}(Q)$, then there is no solution (i.e., $\min_x f(x) = -\infty$)
- if $Q$ is singular and $b \in \mathrm{col}(Q)$, then there are infinitely many solutions

$$x = -Q^+ b + z, \quad z \in \mathrm{null}(Q)$$

where $Q^+$ is the pseudoinverse of $Q$
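The three cases can be illustrated with a short NumPy sketch. The helper `minimize_quadratic`, its tolerance, and the test matrices are assumptions made for illustration; the idea is to form the candidate $-Q^+ b$ and check whether it actually satisfies $Qx + b = 0$.

```python
import numpy as np

def minimize_quadratic(Q, b, tol=1e-10):
    """Return one minimizer of 0.5 x^T Q x + b^T x + c for PSD Q, or None if unbounded below."""
    x = -np.linalg.pinv(Q) @ b            # candidate -Q^+ b (equals -Q^{-1} b when Q is invertible)
    if np.linalg.norm(Q @ x + b) > tol * (1.0 + np.linalg.norm(b)):
        return None                        # b is not in col(Q): stationarity cannot hold
    return x

Q_pd = np.array([[2.0, 0.0], [0.0, 1.0]])     # positive definite
Q_sing = np.array([[1.0, 0.0], [0.0, 0.0]])   # singular, PSD

x_unique = minimize_quadratic(Q_pd, np.array([-2.0, 1.0]))   # unique solution
x_none = minimize_quadratic(Q_sing, np.array([1.0, 1.0]))    # b not in col(Q)
x_many = minimize_quadratic(Q_sing, np.array([1.0, 0.0]))    # one of infinitely many
```

In the last case, adding any multiple of $(0, 1)$, which spans $\mathrm{null}(Q)$ for this `Q_sing`, to `x_many` gives another solution.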
Example: equality-constrained minimization

Consider the equality-constrained convex problem:

$$\min_x \; f(x) \quad \text{subject to} \quad Ax = b$$

with $f$ differentiable. Let's prove the Lagrange multiplier optimality condition

$$\nabla f(x) + A^T u = 0 \quad \text{for some } u$$

According to first-order optimality, the solution $x$ satisfies $Ax = b$ and

$$\nabla f(x)^T (y - x) \ge 0 \quad \text{for all } y \text{ such that } Ay = b$$

This is equivalent to

$$\nabla f(x)^T v = 0 \quad \text{for all } v \in \mathrm{null}(A)$$

The result follows because $\mathrm{null}(A)^\perp = \mathrm{row}(A)$.
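As a numerical illustration of the Lagrange multiplier condition, the sketch below takes an assumed quadratic $f(x) = \frac{1}{2} x^T Q x + c^T x$ with a single equality constraint and solves the linear system formed by stationarity and feasibility. The specific `Q`, `c`, `A`, `b` are made up for the example.

```python
import numpy as np

Q = np.array([[2.0, 0.5], [0.5, 1.0]])   # positive definite, so f is strictly convex
c = np.array([1.0, -1.0])
A = np.array([[1.0, 1.0]])               # one equality constraint: x1 + x2 = 1
b = np.array([1.0])

n, m = Q.shape[0], A.shape[0]
# Stack stationarity (Qx + c + A^T u = 0) and feasibility (Ax = b) into one system.
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
solution = np.linalg.solve(K, np.concatenate([-c, b]))
x, u = solution[:n], solution[n:]

grad = Q @ x + c                         # gradient of f at the solution
stationarity = grad + A.T @ u            # should vanish
v = np.array([1.0, -1.0])                # spans null(A) for this A
grad_along_null = float(grad @ v)        # should vanish as well
```

The gradient at the solution lies in $\mathrm{row}(A)$, which is exactly why it is orthogonal to every direction that keeps $Ax = b$.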
Example: projection onto a convex set

Consider projection onto a convex set $C$:

$$\min_x \; \|a - x\|_2^2 \quad \text{subject to} \quad x \in C$$

The first-order optimality condition says that the solution $x$ satisfies

$$\nabla f(x)^T (y - x) = 2(x - a)^T (y - x) \ge 0 \quad \text{for all } y \in C$$

Equivalently, this says that

$$a - x \in N_C(x)$$

where recall $N_C(x)$ is the normal cone to $C$ at $x$.
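For a set whose projection has a closed form, the condition is easy to check. The sketch below assumes $C$ is the Euclidean unit ball (an illustrative choice), where the projection simply rescales points lying outside the ball, and tests $(x - a)^T (y - x) \ge 0$ on sampled $y \in C$.

```python
import numpy as np

def project_unit_ball(a):
    """Euclidean projection onto {x : ||x||_2 <= 1}."""
    return a / max(1.0, np.linalg.norm(a))

rng = np.random.default_rng(1)
a = np.array([3.0, 4.0])
x = project_unit_ball(a)                  # rescales a to unit norm

# Sample points of the ball by shrinking any draw that falls outside it.
ys = rng.normal(size=(2000, 2))
ys = ys / np.maximum(1.0, np.linalg.norm(ys, axis=1, keepdims=True))

inner = (ys - x) @ (x - a)                # (x - a)^T (y - x) for each sample y
min_inner = float(inner.min())            # nonnegative up to rounding
```

Here $a - x$ is a nonnegative multiple of $x$, the outward normal of the ball at $x$, which matches the normal cone statement.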
Partial optimization

Reminder: $g(x) = \min_{y \in C} f(x, y)$ is convex in $x$, provided that $f$ is convex in $(x, y)$ and $C$ is a convex set.

Therefore we can always partially optimize a convex problem and retain convexity.

E.g., if we decompose $x = (x_1, x_2) \in \mathbb{R}^{n_1 + n_2}$, then

$$\min_{x_1, x_2} \; f(x_1, x_2) \;\; \text{subject to} \;\; g_1(x_1) \le 0, \; g_2(x_2) \le 0 \quad \Longleftrightarrow \quad \min_{x_1} \; \tilde{f}(x_1) \;\; \text{subject to} \;\; g_1(x_1) \le 0$$

where $\tilde{f}(x_1) = \min_{x_2} \{ f(x_1, x_2) : g_2(x_2) \le 0 \}$. The right problem is convex if the left problem is.
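Example: hinge form of SVMs

Recall the SVM problem

$$\min_{\beta, \beta_0, \xi} \; \frac{1}{2}\|\beta\|_2^2 + C \sum_{i=1}^n \xi_i \quad \text{subject to} \quad \xi_i \ge 0, \;\; y_i(x_i^T \beta + \beta_0) \ge 1 - \xi_i, \;\; i = 1, \ldots, n$$

Rewrite the constraints as $\xi_i \ge \max\{0, 1 - y_i(x_i^T \beta + \beta_0)\}$. Indeed we can argue that we have $=$ at the solution.

Therefore plugging in for the optimal $\xi$ gives the hinge form of SVMs:

$$\min_{\beta, \beta_0} \; \frac{1}{2}\|\beta\|_2^2 + C \sum_{i=1}^n \left[ 1 - y_i(x_i^T \beta + \beta_0) \right]_+$$

where $a_+ = \max\{0, a\}$ is called the hinge function.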
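The step from the slack form to the hinge form can be checked numerically: for fixed $(\beta, \beta_0)$, the smallest feasible slack is the hinge value, and any larger feasible slack only increases the criterion. The function names and the random data below are illustrative assumptions.

```python
import numpy as np

def svm_slack_objective(beta, xi, C):
    """Criterion of the slack form: 0.5 ||beta||^2 + C sum(xi)."""
    return 0.5 * beta @ beta + C * np.sum(xi)

def svm_hinge_objective(beta, beta0, X, y, C):
    """Criterion of the hinge form: 0.5 ||beta||^2 + C sum [1 - y_i (x_i^T beta + beta0)]_+."""
    margins = y * (X @ beta + beta0)
    return 0.5 * beta @ beta + C * np.sum(np.maximum(0.0, 1.0 - margins))

rng = np.random.default_rng(2)
X = rng.normal(size=(20, 3))
y = np.where(rng.normal(size=20) > 0, 1.0, -1.0)
beta, beta0, C = rng.normal(size=3), 0.3, 2.0

xi_best = np.maximum(0.0, 1.0 - y * (X @ beta + beta0))   # smallest feasible slack
xi_other = xi_best + rng.uniform(0.0, 1.0, size=20)        # feasible, but larger

hinge_value = svm_hinge_objective(beta, beta0, X, y, C)
slack_value_best = svm_slack_objective(beta, xi_best, C)
slack_value_other = svm_slack_objective(beta, xi_other, C)
```

This is an instance of partial optimization: minimizing over $\xi$ first leaves a convex problem in $(\beta, \beta_0)$ alone.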
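Transformations and change of variables

If $h : \mathbb{R} \to \mathbb{R}$ is a monotone increasing transformation, then

$$\min_x \; f(x) \;\; \text{subject to} \;\; x \in C \quad \Longleftrightarrow \quad \min_x \; h(f(x)) \;\; \text{subject to} \;\; x \in C$$

Similarly, inequality or equality constraints can be transformed and yield equivalent optimization problems. We can use this to reveal the "hidden convexity" of a problem.

If $\phi : \mathbb{R}^n \to \mathbb{R}^m$ is one-to-one, and its image covers the feasible set $C$, then we can change variables in an optimization problem:

$$\min_x \; f(x) \;\; \text{subject to} \;\; x \in C \quad \Longleftrightarrow \quad \min_y \; f(\phi(y)) \;\; \text{subject to} \;\; \phi(y) \in C$$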
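Example: geometric programming

A monomial is a function $f : \mathbb{R}^n_{++} \to \mathbb{R}$ of the form

$$f(x) = \gamma x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$$

for $\gamma > 0$, $a_1, \ldots, a_n \in \mathbb{R}$. A posynomial is a sum of monomials,

$$f(x) = \sum_{k=1}^p \gamma_k x_1^{a_{k1}} x_2^{a_{k2}} \cdots x_n^{a_{kn}}$$

A geometric program is of the form

$$\min_x \; f(x) \quad \text{subject to} \quad g_i(x) \le 1, \; i = 1, \ldots, m, \quad h_j(x) = 1, \; j = 1, \ldots, r$$

where $f$, $g_i$, $i = 1, \ldots, m$ are posynomials and $h_j$, $j = 1, \ldots, r$ are monomials. This is nonconvex.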
Let's prove that a geometric program is equivalent to a convex one. Given $f(x) = \gamma x_1^{a_1} x_2^{a_2} \cdots x_n^{a_n}$, let $y_i = \log x_i$ and rewrite this as

$$\gamma (e^{y_1})^{a_1} (e^{y_2})^{a_2} \cdots (e^{y_n})^{a_n} = e^{a^T y + b}$$

for $b = \log \gamma$. Also, a posynomial can be written as $\sum_{k=1}^p e^{a_k^T y + b_k}$. With this variable substitution, and after taking logs, a geometric program is equivalent to

$$\min_y \; \log\left( \sum_{k=1}^{p_0} e^{a_{0k}^T y + b_{0k}} \right) \quad \text{subject to} \quad \log\left( \sum_{k=1}^{p_i} e^{a_{ik}^T y + b_{ik}} \right) \le 0, \; i = 1, \ldots, m, \quad c_j^T y + d_j = 0, \; j = 1, \ldots, r$$

This is convex, recalling the convexity of soft max functions.
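The substitution $y_i = \log x_i$ can be verified numerically for a single posynomial. In the sketch below, `posynomial` and `log_posynomial` are hypothetical helper names and the coefficients are random; the second function evaluates the log-sum-exp form with the usual max-shift for numerical stability.

```python
import numpy as np

def posynomial(x, gammas, exponents):
    """sum_k gamma_k * prod_i x_i^{a_ki}, for x > 0; exponents has shape (p, n)."""
    return float(np.sum(gammas * np.prod(x ** exponents, axis=1)))

def log_posynomial(yv, gammas, exponents):
    """log sum_k exp(a_k^T y + b_k) with b_k = log gamma_k, computed stably."""
    z = exponents @ yv + np.log(gammas)
    z_max = np.max(z)
    return float(z_max + np.log(np.sum(np.exp(z - z_max))))

rng = np.random.default_rng(5)
gammas = rng.uniform(0.5, 2.0, size=3)        # gamma_k > 0
exponents = rng.normal(size=(3, 4))           # arbitrary real exponents a_ki
x = rng.uniform(0.5, 2.0, size=4)             # a point in the positive orthant

direct = np.log(posynomial(x, gammas, exponents))
transformed = log_posynomial(np.log(x), gammas, exponents)

# Midpoint convexity of the transformed function at two random points.
y1, y2 = rng.normal(size=4), rng.normal(size=4)
mid_value = log_posynomial(0.5 * (y1 + y2), gammas, exponents)
avg_value = 0.5 * (log_posynomial(y1, gammas, exponents) + log_posynomial(y2, gammas, exponents))
```

The midpoint check is only a spot test of convexity in $y$, not a proof; the proof is the soft max argument referred to above.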
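Many interesting problems are geometric programs, e.g., floor planning:

[Figure: bounding box of width $W$ and height $H$ containing a cell at position $(x_i, y_i)$ with width $w_i$, height $h_i$, and label $C_i$]

See Boyd et al. (2007), "A tutorial on geometric programming", and also Chapter 8.8 of the B & V book.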
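Example floor planning program:

$$\begin{aligned} \min_{W, H, x, y, w, h} \quad & WH \\ \text{subject to} \quad & 0 \le x_i \le W, \; i = 1, \ldots, n \\ & 0 \le y_i \le H, \; i = 1, \ldots, n \\ & x_i + w_i \le x_j, \; (i, j) \in L \\ & y_i + h_i \le y_j, \; (i, j) \in B \\ & w_i h_i = C_i, \; i = 1, \ldots, n \end{aligned}$$

Check: why is this a geometric program?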
Eliminating equality constraints

Important special case of change of variables: eliminating equality constraints. Given the problem

$$\min_x \; f(x) \quad \text{subject to} \quad g_i(x) \le 0, \; i = 1, \ldots, m, \quad Ax = b$$

we can always express any feasible point as $x = My + x_0$, where $Ax_0 = b$ and $\mathrm{col}(M) = \mathrm{null}(A)$. Hence the above is equivalent to

$$\min_y \; f(My + x_0) \quad \text{subject to} \quad g_i(My + x_0) \le 0, \; i = 1, \ldots, m$$

Note: this is fully general but not always a good idea (practically).
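One way to obtain $M$ and $x_0$ in practice is through the SVD of $A$. The sketch below is one possible construction, assuming $b \in \mathrm{col}(A)$; the helper name `null_space_parametrization`, the rank tolerance, and the example matrix are illustrative.

```python
import numpy as np

def null_space_parametrization(A, b, tol=1e-12):
    """Return (M, x0) with A x0 = b and col(M) = null(A), computed from the SVD of A."""
    _, s, Vt = np.linalg.svd(A)             # full SVD: Vt has shape (n, n)
    rank = int(np.sum(s > tol * s.max()))
    M = Vt[rank:].T                          # orthonormal basis of null(A)
    x0 = np.linalg.pinv(A) @ b               # a particular solution (assumes b in col(A))
    return M, x0

A = np.array([[1.0, 1.0, 1.0], [1.0, -1.0, 0.0]])
b = np.array([1.0, 0.0])
M, x0 = null_space_parametrization(A, b)

rng = np.random.default_rng(6)
y_free = rng.normal(size=M.shape[1])         # unconstrained variable
x_feasible = M @ y_free + x0                 # satisfies A x = b for any y_free
```

A practical caveat consistent with the note above: even when $A$ is sparse, $M$ is typically dense, so the reduced problem can be more expensive to work with than the original.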
Introducing slack variables

Essentially opposite to eliminating equality constraints: introducing slack variables. Given the problem

$$\min_x \; f(x) \quad \text{subject to} \quad g_i(x) \le 0, \; i = 1, \ldots, m, \quad Ax = b$$

we can transform the inequality constraints via

$$\min_{x, s} \; f(x) \quad \text{subject to} \quad s_i \ge 0, \; i = 1, \ldots, m, \quad g_i(x) + s_i = 0, \; i = 1, \ldots, m, \quad Ax = b$$

Note: this is no longer convex unless $g_i$, $i = 1, \ldots, m$ are affine.
Relaxing nonaffine equality constraints

Given an optimization problem

$$\min_x \; f(x) \quad \text{subject to} \quad x \in C$$

we can always take an enlarged constraint set $\tilde{C} \supseteq C$ and consider

$$\min_x \; f(x) \quad \text{subject to} \quad x \in \tilde{C}$$

This is called a relaxation and its optimal value is always smaller or equal to that of the original problem.

Important special case: relaxing nonaffine equality constraints, i.e.,

$$h_j(x) = 0, \; j = 1, \ldots, r$$

where $h_j$, $j = 1, \ldots, r$ are convex but nonaffine, are replaced with

$$h_j(x) \le 0, \; j = 1, \ldots, r$$
Example: maximum utility problem

The maximum utility problem models investment/consumption:

$$\max_{x, b} \; \sum_{t=1}^T \alpha_t u(x_t) \quad \text{subject to} \quad b_{t+1} = b_t + f(b_t) - x_t, \; t = 1, \ldots, T, \quad 0 \le x_t \le b_t, \; t = 1, \ldots, T$$

Here $b_t$ is the budget and $x_t$ is the amount consumed at time $t$; $f$ is an investment return function, $u$ a utility function, both concave and increasing.

Is this a convex problem? What if we replace the equality constraints with inequalities:

$$b_{t+1} \le b_t + f(b_t) - x_t, \; t = 1, \ldots, T \;?$$
Example: principal components analysis

Given $X \in \mathbb{R}^{n \times p}$, consider the low rank approximation problem:

$$\min_R \; \|X - R\|_F^2 \quad \text{subject to} \quad \mathrm{rank}(R) = k$$

Here $\|A\|_F^2 = \sum_{i=1}^n \sum_{j=1}^p A_{ij}^2$, the entrywise squared $\ell_2$ norm, and $\mathrm{rank}(A)$ denotes the rank of $A$.

Also called the principal components analysis or PCA problem. Given $X = UDV^T$, the singular value decomposition or SVD, the solution is

$$R = U_k D_k V_k^T$$

where $U_k, V_k$ are the first $k$ columns of $U, V$ and $D_k$ is the first $k$ diagonal elements of $D$. I.e., $R$ is the reconstruction of $X$ from its first $k$ principal components.

This problem is not convex. Why?
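The SVD solution, and one reason the rank constraint is nonconvex, can both be seen in a few lines of NumPy. The helper `best_rank_k` and the random data are illustrative assumptions; the identity used in the comments (squared error equals the sum of the discarded squared singular values) is the standard Eckart-Young fact.

```python
import numpy as np

def best_rank_k(X, k):
    """Truncated SVD reconstruction U_k D_k V_k^T of X."""
    U, d, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k] @ np.diag(d[:k]) @ Vt[:k]

rng = np.random.default_rng(3)
X = rng.normal(size=(8, 5))
k = 2
R = best_rank_k(X, k)

d = np.linalg.svd(X, compute_uv=False)          # singular values, in descending order
error = np.linalg.norm(X - R, "fro") ** 2       # equals the sum of discarded d_i^2
discarded = float(np.sum(d[k:] ** 2))

# The set {R : rank(R) = k} is not convex: averaging two rank-1 matrices can give rank 2.
R1, R2 = np.diag([1.0, 0.0]), np.diag([0.0, 1.0])
rank_of_midpoint = int(np.linalg.matrix_rank(0.5 * (R1 + R2)))
```

The last three lines give a direct answer to the closing question: the feasible set is not closed under convex combinations.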
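We can recast the PCA problem in a convex form. First rewrite as

$$\min_{Z \in \mathbb{S}^p} \; \|X - XZ\|_F^2 \;\; \text{subject to} \;\; \mathrm{rank}(Z) = k, \; Z \text{ is a projection} \quad \Longleftrightarrow \quad \max_{Z \in \mathbb{S}^p} \; \mathrm{tr}(SZ) \;\; \text{subject to} \;\; \mathrm{rank}(Z) = k, \; Z \text{ is a projection}$$

where $S = X^T X$. Hence the constraint set is the nonconvex set

$$C = \left\{ Z \in \mathbb{S}^p : \lambda_i(Z) \in \{0, 1\}, \; i = 1, \ldots, p, \; \mathrm{tr}(Z) = k \right\}$$

where $\lambda_i(Z)$, $i = 1, \ldots, p$ are the eigenvalues of $Z$. The solution in this formulation is

$$Z = V_k V_k^T$$

where $V_k$ gives the first $k$ columns of $V$.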
Now consider relaxing the constraint set to $F_k = \mathrm{conv}(C)$, its convex hull. Note

$$F_k = \{ Z \in \mathbb{S}^p : \lambda_i(Z) \in [0, 1], \; i = 1, \ldots, p, \; \mathrm{tr}(Z) = k \} = \{ Z \in \mathbb{S}^p : 0 \preceq Z \preceq I, \; \mathrm{tr}(Z) = k \}$$

Recall this is called the Fantope of order $k$.

Hence, the linear maximization over the Fantope, namely

$$\max_{Z \in F_k} \; \mathrm{tr}(SZ)$$

is convex. Remarkably, this is equivalent to the nonconvex PCA problem (admits the same solution)!

(Famous result: Fan (1949), "On a theorem of Weyl concerning eigenvalues of linear transformations")
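The equivalence can be spot-checked numerically: the projection $V_k V_k^T$ attains the sum of the $k$ largest eigenvalues of $S$, and randomly generated Fantope members never exceed it. The sampler below (`random_fantope_point`, which mixes permuted 0/1 eigenvalue patterns in a random orthonormal basis) is just one convenient way to produce points of $F_k$ for illustration, not a method from the lecture.

```python
import numpy as np

def random_fantope_point(p, k, rng):
    """Z = Q diag(lam) Q^T with lam in [0, 1]^p and sum(lam) = k."""
    Q, _ = np.linalg.qr(rng.normal(size=(p, p)))       # random orthonormal basis
    base = np.concatenate([np.ones(k), np.zeros(p - k)])
    weights = rng.dirichlet(np.ones(5))                # convex combination weights
    lam = sum(w * rng.permutation(base) for w in weights)
    return Q @ np.diag(lam) @ Q.T

rng = np.random.default_rng(4)
X = rng.normal(size=(10, 4))
S = X.T @ X
k = 2

evals, evecs = np.linalg.eigh(S)            # eigenvalues in ascending order
Vk = evecs[:, -k:]                          # eigenvectors of the k largest eigenvalues
Z_pca = Vk @ Vk.T                           # rank-k projection from the PCA problem
best = float(np.trace(S @ Z_pca))           # sum of the k largest eigenvalues of S

sampled = [float(np.trace(S @ random_fantope_point(4, k, rng))) for _ in range(200)]
largest_sampled = max(sampled)              # should not exceed `best`
```

Random sampling cannot prove optimality; it only illustrates that the relaxed maximum is attained at an extreme point of $F_k$, which is a rank-$k$ projection.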
References and further reading

- S. Boyd and L. Vandenberghe (2004), "Convex optimization", Chapter 4
- O. Guler (2010), "Foundations of optimization", Chapter 4
More informationLecture: Duality of LP, SOCP and SDP
1/33 Lecture: Duality of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2017.html wenzw@pku.edu.cn Acknowledgement:
More information1. Sets A set is any collection of elements. Examples: - the set of even numbers between zero and the set of colors on the national flag.
San Francisco State University Math Review Notes Michael Bar Sets A set is any collection of elements Eamples: a A {,,4,6,8,} - the set of even numbers between zero and b B { red, white, bule} - the set
More informationB553 Lecture 5: Matrix Algebra Review
B553 Lecture 5: Matrix Algebra Review Kris Hauser January 19, 2012 We have seen in prior lectures how vectors represent points in R n and gradients of functions. Matrices represent linear transformations
More informationLecture 4: Convex Functions, Part I February 1
IE 521: Convex Optimization Instructor: Niao He Lecture 4: Convex Functions, Part I February 1 Spring 2017, UIUC Scribe: Shuanglong Wang Courtesy warning: These notes do not necessarily cover everything
More informationLearning the Kernel Matrix with Semi-Definite Programming
Learning the Kernel Matrix with Semi-Definite Programg Gert R.G. Lanckriet gert@cs.berkeley.edu Department of Electrical Engineering and Computer Science University of California, Berkeley, CA 94720, USA
More informationLinear Algebra & Geometry why is linear algebra useful in computer vision?
Linear Algebra & Geometry why is linear algebra useful in computer vision? References: -Any book on linear algebra! -[HZ] chapters 2, 4 Some of the slides in this lecture are courtesy to Prof. Octavia
More informationIntroduction to Machine Learning Lecture 7. Mehryar Mohri Courant Institute and Google Research
Introduction to Machine Learning Lecture 7 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Convex Optimization Differentiation Definition: let f : X R N R be a differentiable function,
More informationReview of Optimization Basics
Review of Optimization Basics. Introduction Electricity markets throughout the US are said to have a two-settlement structure. The reason for this is that the structure includes two different markets:
More informationLecture Note 5: Semidefinite Programming for Stability Analysis
ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State
More informationLecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem
Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 0-0 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R
More informationLMI MODELLING 4. CONVEX LMI MODELLING. Didier HENRION. LAAS-CNRS Toulouse, FR Czech Tech Univ Prague, CZ. Universidad de Valladolid, SP March 2009
LMI MODELLING 4. CONVEX LMI MODELLING Didier HENRION LAAS-CNRS Toulouse, FR Czech Tech Univ Prague, CZ Universidad de Valladolid, SP March 2009 Minors A minor of a matrix F is the determinant of a submatrix
More informationLinear Algebra & Geometry why is linear algebra useful in computer vision?
Linear Algebra & Geometry why is linear algebra useful in computer vision? References: -Any book on linear algebra! -[HZ] chapters 2, 4 Some of the slides in this lecture are courtesy to Prof. Octavia
More informationConvex envelopes, cardinality constrained optimization and LASSO. An application in supervised learning: support vector machines (SVMs)
ORF 523 Lecture 8 Princeton University Instructor: A.A. Ahmadi Scribe: G. Hall Any typos should be emailed to a a a@princeton.edu. 1 Outline Convexity-preserving operations Convex envelopes, cardinality
More informationConvex Optimization M2
Convex Optimization M2 Lecture 3 A. d Aspremont. Convex Optimization M2. 1/49 Duality A. d Aspremont. Convex Optimization M2. 2/49 DMs DM par email: dm.daspremont@gmail.com A. d Aspremont. Convex Optimization
More informationLeast Squares Optimization
Least Squares Optimization The following is a brief review of least squares optimization and constrained optimization techniques, which are widely used to analyze and visualize data. Least squares (LS)
More informationLecture 14: Newton s Method
10-725/36-725: Conve Optimization Fall 2016 Lecturer: Javier Pena Lecture 14: Newton s ethod Scribes: Varun Joshi, Xuan Li Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes
More informationLinear Algebra Review
Linear Algebra Review ORIE 4741 September 1, 2017 Linear Algebra Review September 1, 2017 1 / 33 Outline 1 Linear Independence and Dependence 2 Matrix Rank 3 Invertible Matrices 4 Norms 5 Projection Matrix
More informationGradient Descent. Ryan Tibshirani Convex Optimization /36-725
Gradient Descent Ryan Tibshirani Convex Optimization 10-725/36-725 Last time: canonical convex programs Linear program (LP): takes the form min x subject to c T x Gx h Ax = b Quadratic program (QP): like
More informationLecture 7: Positive Semidefinite Matrices
Lecture 7: Positive Semidefinite Matrices Rajat Mittal IIT Kanpur The main aim of this lecture note is to prepare your background for semidefinite programming. We have already seen some linear algebra.
More information