Convex Optimization: Applications

Size: px
Start display at page:

Download "Convex Optimization: Applications"

Transcription

1 Convex Optimization: Applications Lecturer: Pradeep Ravikumar Co-instructor: Aarti Singh Convex Optimization 1-75/36-75 Based on material from Boyd, Vandenberghe

2 Norm Approximation minimize Ax b (A R m n with m n, is a norm on R m ) interpretations of solution x = argmin x Ax b : geometric: Ax is point in R(A) closest to b estimation: linearmeasurementmodel y = Ax + v y are measurements, x is unknown, v is measurement error given y = b, best guess of x is x optimal design: x are design variables (input), Ax is result (output) x is design that best approximates desired result b

3 Norm Approximation: ell_ norm minimize Ax b (A R m n with m n, is a norm on R m ) least-squares approximation ( ): solution satisfies normal equations A T Ax = A T b (x =(A T A) 1 A T b if rank A = n)

4 Norm Approximation: ell_infty norm minimize Ax b (A R m n with m n, is a norm on R m ) Chebyshev approximation ( ): can be solved as an LP minimize subject to t t1 Ax b t1

5 Norm Approximation: ell_1 norm minimize Ax b (A R m n with m n, is a norm on R m ) sum of absolute residuals approximation ( 1 ): can be solved as an LP minimize subject to 1 T y y Ax b y

6 Penalty Function Approximation minimize φ(r 1 )+ + φ(r m ) subject to r = Ax b (A R m n, φ : R R is a convex penalty function) examples quadratic: φ(u) =u deadzone-linear with width a: 1.5 log barrier quadratic φ(u) = max{, u a} φ(u) 1.5 deadzone-linear log-barrier with limit a: φ(u) = { a log(1 (u/a) ) u <a otherwise u

7 Penalty Function Approximation Huber penalty function (with parameter M) φ hub (u) = { u u M M( u M) u >M linear growth for large u makes approximation less sensitive to outliers φhub(u) f(t) u t left: Huber penalty for M =1 right: affine function f(t) =α + βt fitted to 4 points t i, y i (circles) using quadratic (dashed) and Huber (solid) penalty

8 Least Norm Problems minimize subject to x Ax = b (A R m n with m n, is a norm on R n ) interpretations of solution x = argmin Ax=b x : geometric: x is point in affine set {x Ax = b} with minimum distance to estimation: b = Ax are (perfect) measurements of x; x is smallest ( most plausible ) estimate consistent with measurements design: x are design variables (inputs); b are required results (outputs) x is smallest ( most efficient ) design that satisfies requirements

9 Least Norm Problems: ell_1 norm minimize subject to x Ax = b (A R m n with m n, is a norm on R n ) minimum sum of absolute values ( 1 ): can be solved as an LP minimize 1 T y subject to y x y, Ax = b tends to produce sparse solution x

10 Least Norm Problems: least penalty extension minimize subject to x Ax = b (A R m n with m n, is a norm on R n ) extension: least-penalty problem minimize φ(x 1 )+ + φ(x n ) subject to Ax = b φ : R R is convex penalty function

11 Regularized Approximation minimize (w.r.t. R +) ( Ax b, x ) A R m n, norms on R m and R n can be different interpretation: find good approximation Ax b with small x estimation: linear measurement model y = Ax + v, withprior knowledge that x is small optimal design: smallx is cheaper or more efficient, or the linear model y = Ax is only valid for small x robust approximation: good approximation Ax b with small x is less sensitive to errors in A than good approximation with large x

12 Regularized Approximation minimize Ax b + γ x solution for γ > traces out optimal trade-off curve other common method: minimize Ax b + δ x with δ > Tikhonov regularization minimize Ax b + δ x can be solved as a least-squares problem [ ] minimize δi A x solution x =(A T A + δi) 1 A T b [ b ]

13 Signal Reconstruction minimize (w.r.t. R +) ( ˆx x cor, φ(ˆx)) x R n is unknown signal x cor = x + v is (known) corrupted version of x, with additive noise v variable ˆx (reconstructed signal) is estimate of x φ : R n R is regularization function or smoothing objective examples: quadratic smoothing, total variation smoothing: φ quad (ˆx) = n 1 (ˆx i+1 ˆx i ), φ tv (ˆx) = i=1 n 1 i=1 ˆx i+1 ˆx i

14 Signal Reconstruction: Quadratic Smoothing.5.5 ˆx x ˆx xcor ˆx i 3 original signal x and noisy signal x cor 4 1 i 3 4 three solutions on trade-off curve ˆx x cor versus φ quad (ˆx)

15 Signal Reconstruction: Total Variation Smoothing ˆxi x ˆxi xcor 1 ˆxi 5 1 i 15 original signal x and noisy signal x cor 5 1 i 15 three solutions on trade-off curve ˆx x cor versus φ quad (ˆx) quadratic smoothing smooths out noise and sharp transitions in signal

16 Signal Reconstruction: Total Variation Smoothing ˆx x ˆx xcor 1 ˆx 5 1 i 15 original signal x and noisy signal x cor 5 1 i 15 three solutions on trade-off curve ˆx x cor versus φ tv (ˆx) total variation smoothing preserves sharp transitions in signal

17 Robust Approximation minimize Ax b with uncertain A two approaches: stochastic: assumea is random, minimize E Ax b worst-case: set A of possible values of A, minimizesup A A Ax b tractable only in special cases (certain norms, distributions, sets A)

18 Robust Approximation minimize Ax b with uncertain A two approaches: stochastic: assumea is random, minimize E Ax b worst-case: set A of possible values of A, minimizesup A A Ax b tractable only in special cases (certain norms, distributions, sets A) example: A(u) =A + ua 1 x nom minimizes A x b x nom x stoch minimizes E A(u)x b with u uniform on [ 1, 1] r(u) 6 4 x stoch x wc x wc minimizes sup 1 u 1 A(u)x b figure shows r(u) = A(u)x b 1 1 u

19 Robust Approximation: Stochastic stochastic robust LS with A = Ā + U, U random, E U =, E U T U = P minimize E (Ā + U)x b explicit expression for objective: E Ax b = E Āx b + Ux = Āx b + E x T U T Ux = Āx b + x T Px hence, robust LS problem is equivalent to LS problem minimize Āx b + P 1/ x for P = δi, get Tikhonov regularized problem minimize Āx b + δ x

20 Robust Approximation: Worst-Case worst-case robust LS with A = {Ā + u 1A u p A p u 1} minimize sup A A Ax b =sup u 1 P (x)u + q(x) where P (x) = [ A 1 x A x A p x ], q(x) =Āx b

21 Robust Approximation: Worst-Case worst-case robust LS with A = {Ā + u 1A u p A p u 1} minimize sup A A Ax b =sup u 1 P (x)u + q(x) where P (x) = [ A 1 x A x A p x ], q(x) =Āx b from It can page be shown 5 14, strong duality holds between the following problems maximize Pu+ q subject to u 1 minimize subject to t + λ I P q P T λi q T t hence, robust LS problem is equivalent to SDP minimize subject to t + λ I P(x) q(x) P (x) T λi q(x) T t

22 Statistical Estimation distribution estimation problem: estimate probability density p(y) of a random variable from observed values parametric distribution estimation: choose from a family of densities p x (y), indexedbyaparameterx maximum likelihood estimation maximize (over x) log p x (y) y is observed value l(x) = log p x (y) is called log-likelihood function can add constraints x C explicitly, or define p x (y) =for x C a convex optimization problem if log p x (y) is concave in x for fixed y

23 Statistical Estimation: Linear Measurements with Noise linear measurement model y i = a T i x + v i, i =1,...,m x R n is vector of unknown parameters v i is IID measurement noise, with density p(z) y i is measurement: y R m has density p x (y) = m i=1 p(y i a T i x) maximum likelihood estimate: any solution x of maximize l(x) = m i=1 log p(y i a T i x) (y is observed value)

24 Statistical Estimation: Linear Measurements with Noise Gaussian noise N (, σ ): p(z) =(πσ ) 1/ e z /(σ ), ML estimate is LS solution l(x) = m log(πσ ) 1 σ Laplacian noise: p(z) =(1/(a))e z /a, m (a T i x y i ) i=1 l(x) = m log(a) 1 a ML estimate is l 1 -norm solution uniform noise on [ a, a]: m a T i x y i i=1 l(x) = { m log(a) a T i x y i a, i =1,...,m otherwise ML estimate is any x with a T i x y i a

25 Statistical Estimation: Logistic Regression random variable y {, 1} with distribution p = prob(y = 1) = exp(at u + b) 1+exp(a T u + b) a, b are parameters; u R n are (observable) explanatory variables estimation problem: estimate a, b from m observations (u i,y i ) log-likelihood function (for y 1 = = y k =1, y k+1 = = y m =): k l(a, b) = log exp(a T u i + b) m 1 1+exp(a T u i + b) 1+exp(a T u i + b) i=1 i=k+1 = k (a T u i + b) m log(1 + exp(a T u i + b)) i=1 i=1 concave in a, b

26 Minimum Volume Ellipsoid Around a Set

27 Minimum Volume Ellipsoid Around a Set Löwner-John ellipsoid of a set C: minimum volume ellipsoid E s.t. C E parametrize E as E = {v Av + b 1}; w.l.o.g. assume A S n ++ vol E is proportional to det A 1 ; to compute minimum volume ellipsoid, minimize (over A, b) log det A 1 subject to sup v C Av + b 1 convex, but evaluating the constraint can be hard (for general C)

28 Minimum Volume Ellipsoid Around a Set Löwner-John ellipsoid of a set C: minimum volume ellipsoid E s.t. C E parametrize E as E = {v Av + b 1}; w.l.o.g. assume A S n ++ vol E is proportional to det A 1 ; to compute minimum volume ellipsoid, minimize (over A, b) log det A 1 subject to sup v C Av + b 1 convex, but evaluating the constraint can be hard (for general C) finite set C = {x 1,...,x m }: minimize (over A, b) log det A 1 subject to Ax i + b 1, i =1,...,m also gives Löwner-John ellipsoid for polyhedron conv{x 1,...,x m }

29 Maximum Volume Inscribed Ellipsoid

30 Maximum Volume Inscribed Ellipsoid maximum volume ellipsoid E inside a convex set C R n parametrize E as E = {Bu + d u 1}; w.l.o.g. assume B S n ++ vol E is proportional to det B; can compute E by solving maximize log det B subject to sup u 1 I C (Bu + d) (where I C (x) =for x C and I C (x) = for x C) convex, but evaluating the constraint can be hard (for general C)

31 Maximum Volume Inscribed Ellipsoid maximum volume ellipsoid E inside a convex set C R n parametrize E as E = {Bu + d u 1}; w.l.o.g. assume B S n ++ vol E is proportional to det B; can compute E by solving maximize log det B subject to sup u 1 I C (Bu + d) (where I C (x) =for x C and I C (x) = for x C) convex, but evaluating the constraint can be hard (for general C) polyhedron {x a T i x b i,i=1,...,m}: maximize log det B subject to Ba i + a T i d b i, i =1,...,m (constraint follows from sup u 1 a T i (Bu + d) = Ba i + a T i d)

6. Approximation and fitting

6. Approximation and fitting 6. Approximation and fitting Convex Optimization Boyd & Vandenberghe norm approximation least-norm problems regularized approximation robust approximation 6 Norm approximation minimize Ax b (A R m n with

More information

Convex Optimization M2

Convex Optimization M2 Convex Optimization M2 Lecture 8 A. d Aspremont. Convex Optimization M2. 1/57 Applications A. d Aspremont. Convex Optimization M2. 2/57 Outline Geometrical problems Approximation problems Combinatorial

More information

7. Statistical estimation

7. Statistical estimation 7. Statistical estimation Convex Optimization Boyd & Vandenberghe maximum likelihood estimation optimal detector design experiment design 7 1 Parametric distribution estimation distribution estimation

More information

7. Statistical estimation

7. Statistical estimation 7. Statistical estimation Convex Optimization Boyd & Vandenberghe maximum likelihood estimation optimal detector design experiment design 7 1 Parametric distribution estimation distribution estimation

More information

8. Geometric problems

8. Geometric problems 8. Geometric problems Convex Optimization Boyd & Vandenberghe extremal volume ellipsoids centering classification placement and facility location 8 1 Minimum volume ellipsoid around a set Löwner-John ellipsoid

More information

8. Geometric problems

8. Geometric problems 8. Geometric problems Convex Optimization Boyd & Vandenberghe extremal volume ellipsoids centering classification placement and facility location 8 Minimum volume ellipsoid around a set Löwner-John ellipsoid

More information

9. Geometric problems

9. Geometric problems 9. Geometric problems EE/AA 578, Univ of Washington, Fall 2016 projection on a set extremal volume ellipsoids centering classification 9 1 Projection on convex set projection of point x on set C defined

More information

Nonlinear Programming Models

Nonlinear Programming Models Nonlinear Programming Models Fabio Schoen 2008 http://gol.dsi.unifi.it/users/schoen Nonlinear Programming Models p. Introduction Nonlinear Programming Models p. NLP problems minf(x) x S R n Standard form:

More information

4. Convex optimization problems

4. Convex optimization problems Convex Optimization Boyd & Vandenberghe 4. Convex optimization problems optimization problem in standard form convex optimization problems quasiconvex optimization linear optimization quadratic optimization

More information

Geometric problems. Chapter Projection on a set. The distance of a point x 0 R n to a closed set C R n, in the norm, is defined as

Geometric problems. Chapter Projection on a set. The distance of a point x 0 R n to a closed set C R n, in the norm, is defined as Chapter 8 Geometric problems 8.1 Projection on a set The distance of a point x 0 R n to a closed set C R n, in the norm, is defined as dist(x 0,C) = inf{ x 0 x x C}. The infimum here is always achieved.

More information

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17 EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 17 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory May 29, 2012 Andre Tkacenko

More information

Lecture: Convex Optimization Problems

Lecture: Convex Optimization Problems 1/36 Lecture: Convex Optimization Problems http://bicmr.pku.edu.cn/~wenzw/opt-2015-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghe s lecture notes Introduction 2/36 optimization

More information

4. Convex optimization problems

4. Convex optimization problems Convex Optimization Boyd & Vandenberghe 4. Convex optimization problems optimization problem in standard form convex optimization problems quasiconvex optimization linear optimization quadratic optimization

More information

CSCI5654 (Linear Programming, Fall 2013) Lectures Lectures 10,11 Slide# 1

CSCI5654 (Linear Programming, Fall 2013) Lectures Lectures 10,11 Slide# 1 CSCI5654 (Linear Programming, Fall 2013) Lectures 10-12 Lectures 10,11 Slide# 1 Today s Lecture 1. Introduction to norms: L 1,L 2,L. 2. Casting absolute value and max operators. 3. Norm minimization problems.

More information

Determinant maximization with linear. S. Boyd, L. Vandenberghe, S.-P. Wu. Information Systems Laboratory. Stanford University

Determinant maximization with linear. S. Boyd, L. Vandenberghe, S.-P. Wu. Information Systems Laboratory. Stanford University Determinant maximization with linear matrix inequality constraints S. Boyd, L. Vandenberghe, S.-P. Wu Information Systems Laboratory Stanford University SCCM Seminar 5 February 1996 1 MAXDET problem denition

More information

Convex optimization problems. Optimization problem in standard form

Convex optimization problems. Optimization problem in standard form Convex optimization problems optimization problem in standard form convex optimization problems linear optimization quadratic optimization geometric programming quasiconvex optimization generalized inequality

More information

minimize x x2 2 x 1x 2 x 1 subject to x 1 +2x 2 u 1 x 1 4x 2 u 2, 5x 1 +76x 2 1,

minimize x x2 2 x 1x 2 x 1 subject to x 1 +2x 2 u 1 x 1 4x 2 u 2, 5x 1 +76x 2 1, 4 Duality 4.1 Numerical perturbation analysis example. Consider the quadratic program with variables x 1, x 2, and parameters u 1, u 2. minimize x 2 1 +2x2 2 x 1x 2 x 1 subject to x 1 +2x 2 u 1 x 1 4x

More information

ELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications

ELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications ELE539A: Optimization of Communication Systems Lecture 15: Semidefinite Programming, Detection and Estimation Applications Professor M. Chiang Electrical Engineering Department, Princeton University March

More information

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 18

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 18 EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 18 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory May 31, 2012 Andre Tkacenko

More information

ICS-E4030 Kernel Methods in Machine Learning

ICS-E4030 Kernel Methods in Machine Learning ICS-E4030 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 28. September, 2016 Juho Rousu 28. September, 2016 1 / 38 Convex optimization Convex optimisation This

More information

Advances in Convex Optimization: Theory, Algorithms, and Applications

Advances in Convex Optimization: Theory, Algorithms, and Applications Advances in Convex Optimization: Theory, Algorithms, and Applications Stephen Boyd Electrical Engineering Department Stanford University (joint work with Lieven Vandenberghe, UCLA) ISIT 02 ISIT 02 Lausanne

More information

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44 Convex Optimization Newton s method ENSAE: Optimisation 1/44 Unconstrained minimization minimize f(x) f convex, twice continuously differentiable (hence dom f open) we assume optimal value p = inf x f(x)

More information

CS-E4830 Kernel Methods in Machine Learning

CS-E4830 Kernel Methods in Machine Learning CS-E4830 Kernel Methods in Machine Learning Lecture 3: Convex optimization and duality Juho Rousu 27. September, 2017 Juho Rousu 27. September, 2017 1 / 45 Convex optimization Convex optimisation This

More information

Convex Optimization and l 1 -minimization

Convex Optimization and l 1 -minimization Convex Optimization and l 1 -minimization Sangwoon Yun Computational Sciences Korea Institute for Advanced Study December 11, 2009 2009 NIMS Thematic Winter School Outline I. Convex Optimization II. l

More information

Lecture 5 Least-squares

Lecture 5 Least-squares EE263 Autumn 2008-09 Stephen Boyd Lecture 5 Least-squares least-squares (approximate) solution of overdetermined equations projection and orthogonality principle least-squares estimation BLUE property

More information

Short Course Robust Optimization and Machine Learning. 3. Optimization in Supervised Learning

Short Course Robust Optimization and Machine Learning. 3. Optimization in Supervised Learning Short Course Robust Optimization and 3. Optimization in Supervised EECS and IEOR Departments UC Berkeley Spring seminar TRANSP-OR, Zinal, Jan. 16-19, 2012 Outline Overview of Supervised models and variants

More information

Agenda. Interior Point Methods. 1 Barrier functions. 2 Analytic center. 3 Central path. 4 Barrier method. 5 Primal-dual path following algorithms

Agenda. Interior Point Methods. 1 Barrier functions. 2 Analytic center. 3 Central path. 4 Barrier method. 5 Primal-dual path following algorithms Agenda Interior Point Methods 1 Barrier functions 2 Analytic center 3 Central path 4 Barrier method 5 Primal-dual path following algorithms 6 Nesterov Todd scaling 7 Complexity analysis Interior point

More information

12. Interior-point methods

12. Interior-point methods 12. Interior-point methods Convex Optimization Boyd & Vandenberghe inequality constrained minimization logarithmic barrier function and central path barrier method feasibility and phase I methods complexity

More information

Lecture Note 5: Semidefinite Programming for Stability Analysis

Lecture Note 5: Semidefinite Programming for Stability Analysis ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State

More information

subject to (x 2)(x 4) u,

subject to (x 2)(x 4) u, Exercises Basic definitions 5.1 A simple example. Consider the optimization problem with variable x R. minimize x 2 + 1 subject to (x 2)(x 4) 0, (a) Analysis of primal problem. Give the feasible set, the

More information

Lecture 3. Linear Regression II Bastian Leibe RWTH Aachen

Lecture 3. Linear Regression II Bastian Leibe RWTH Aachen Advanced Machine Learning Lecture 3 Linear Regression II 02.11.2015 Bastian Leibe RWTH Aachen http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de This Lecture: Advanced Machine Learning Regression

More information

Uses of duality. Geoff Gordon & Ryan Tibshirani Optimization /

Uses of duality. Geoff Gordon & Ryan Tibshirani Optimization / Uses of duality Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Remember conjugate functions Given f : R n R, the function is called its conjugate f (y) = max x R n yt x f(x) Conjugates appear

More information

MATH 423 Linear Algebra II Lecture 10: Inverse matrix. Change of coordinates.

MATH 423 Linear Algebra II Lecture 10: Inverse matrix. Change of coordinates. MATH 423 Linear Algebra II Lecture 10: Inverse matrix. Change of coordinates. Let V be a vector space and α = [v 1,...,v n ] be an ordered basis for V. Theorem 1 The coordinate mapping C : V F n given

More information

CS295: Convex Optimization. Xiaohui Xie Department of Computer Science University of California, Irvine

CS295: Convex Optimization. Xiaohui Xie Department of Computer Science University of California, Irvine CS295: Convex Optimization Xiaohui Xie Department of Computer Science University of California, Irvine Course information Prerequisites: multivariate calculus and linear algebra Textbook: Convex Optimization

More information

Conditional Gradient (Frank-Wolfe) Method

Conditional Gradient (Frank-Wolfe) Method Conditional Gradient (Frank-Wolfe) Method Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 1 Outline Today: Conditional gradient method Convergence analysis Properties

More information

Lecture: Examples of LP, SOCP and SDP

Lecture: Examples of LP, SOCP and SDP 1/34 Lecture: Examples of LP, SOCP and SDP Zaiwen Wen Beijing International Center For Mathematical Research Peking University http://bicmr.pku.edu.cn/~wenzw/bigdata2018.html wenzw@pku.edu.cn Acknowledgement:

More information

Introduction to Machine Learning Lecture 14. Mehryar Mohri Courant Institute and Google Research

Introduction to Machine Learning Lecture 14. Mehryar Mohri Courant Institute and Google Research Introduction to Machine Learning Lecture 14 Mehryar Mohri Courant Institute and Google Research mohri@cims.nyu.edu Density Estimation Maxent Models 2 Entropy Definition: the entropy of a random variable

More information

Lecture 6: Conic Optimization September 8

Lecture 6: Conic Optimization September 8 IE 598: Big Data Optimization Fall 2016 Lecture 6: Conic Optimization September 8 Lecturer: Niao He Scriber: Juan Xu Overview In this lecture, we finish up our previous discussion on optimality conditions

More information

Disciplined Convex Programming

Disciplined Convex Programming Disciplined Convex Programming Stephen Boyd Michael Grant Electrical Engineering Department, Stanford University University of Pennsylvania, 3/30/07 Outline convex optimization checking convexity via convex

More information

Motivation Sparse Signal Recovery is an interesting area with many potential applications. Methods developed for solving sparse signal recovery proble

Motivation Sparse Signal Recovery is an interesting area with many potential applications. Methods developed for solving sparse signal recovery proble Bayesian Methods for Sparse Signal Recovery Bhaskar D Rao 1 University of California, San Diego 1 Thanks to David Wipf, Zhilin Zhang and Ritwik Giri Motivation Sparse Signal Recovery is an interesting

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

Stochastic Subgradient Methods

Stochastic Subgradient Methods Stochastic Subgradient Methods Stephen Boyd and Almir Mutapcic Notes for EE364b, Stanford University, Winter 26-7 April 13, 28 1 Noisy unbiased subgradient Suppose f : R n R is a convex function. We say

More information

Machine Learning Basics Lecture 2: Linear Classification. Princeton University COS 495 Instructor: Yingyu Liang

Machine Learning Basics Lecture 2: Linear Classification. Princeton University COS 495 Instructor: Yingyu Liang Machine Learning Basics Lecture 2: Linear Classification Princeton University COS 495 Instructor: Yingyu Liang Review: machine learning basics Math formulation Given training data x i, y i : 1 i n i.i.d.

More information

Outline lecture 2 2(30)

Outline lecture 2 2(30) Outline lecture 2 2(3), Lecture 2 Linear Regression it is our firm belief that an understanding of linear models is essential for understanding nonlinear ones Thomas Schön Division of Automatic Control

More information

Convex Optimization. Lecture 12 - Equality Constrained Optimization. Instructor: Yuanzhang Xiao. Fall University of Hawaii at Manoa

Convex Optimization. Lecture 12 - Equality Constrained Optimization. Instructor: Yuanzhang Xiao. Fall University of Hawaii at Manoa Convex Optimization Lecture 12 - Equality Constrained Optimization Instructor: Yuanzhang Xiao University of Hawaii at Manoa Fall 2017 1 / 19 Today s Lecture 1 Basic Concepts 2 for Equality Constrained

More information

Max Margin-Classifier

Max Margin-Classifier Max Margin-Classifier Oliver Schulte - CMPT 726 Bishop PRML Ch. 7 Outline Maximum Margin Criterion Math Maximizing the Margin Non-Separable Data Kernels and Non-linear Mappings Where does the maximization

More information

Lecture 2 Machine Learning Review

Lecture 2 Machine Learning Review Lecture 2 Machine Learning Review CMSC 35246: Deep Learning Shubhendu Trivedi & Risi Kondor University of Chicago March 29, 2017 Things we will look at today Formal Setup for Supervised Learning Things

More information

Convex Optimization in Classification Problems

Convex Optimization in Classification Problems New Trends in Optimization and Computational Algorithms December 9 13, 2001 Convex Optimization in Classification Problems Laurent El Ghaoui Department of EECS, UC Berkeley elghaoui@eecs.berkeley.edu 1

More information

Maximum likelihood estimation

Maximum likelihood estimation Maximum likelihood estimation Guillaume Obozinski Ecole des Ponts - ParisTech Master MVA Maximum likelihood estimation 1/26 Outline 1 Statistical concepts 2 A short review of convex analysis and optimization

More information

Interior Point Methods: Second-Order Cone Programming and Semidefinite Programming

Interior Point Methods: Second-Order Cone Programming and Semidefinite Programming School of Mathematics T H E U N I V E R S I T Y O H F E D I N B U R G Interior Point Methods: Second-Order Cone Programming and Semidefinite Programming Jacek Gondzio Email: J.Gondzio@ed.ac.uk URL: http://www.maths.ed.ac.uk/~gondzio

More information

Lecture 2: Tikhonov-Regularization

Lecture 2: Tikhonov-Regularization Lecture 2: Tikhonov-Regularization Bastian von Harrach harrach@math.uni-stuttgart.de Chair of Optimization and Inverse Problems, University of Stuttgart, Germany Advanced Instructional School on Theoretical

More information

CSCI : Optimization and Control of Networks. Review on Convex Optimization

CSCI : Optimization and Control of Networks. Review on Convex Optimization CSCI7000-016: Optimization and Control of Networks Review on Convex Optimization 1 Convex set S R n is convex if x,y S, λ,µ 0, λ+µ = 1 λx+µy S geometrically: x,y S line segment through x,y S examples (one

More information

5. Duality. Lagrangian

5. Duality. Lagrangian 5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

More information

ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications

ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications ELE539A: Optimization of Communication Systems Lecture 6: Quadratic Programming, Geometric Programming, and Applications Professor M. Chiang Electrical Engineering Department, Princeton University February

More information

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation.

Mark your answers ON THE EXAM ITSELF. If you are not sure of your answer you may wish to provide a brief explanation. CS 189 Spring 2015 Introduction to Machine Learning Midterm You have 80 minutes for the exam. The exam is closed book, closed notes except your one-page crib sheet. No calculators or electronic items.

More information

Sparse regression. Optimization-Based Data Analysis. Carlos Fernandez-Granda

Sparse regression. Optimization-Based Data Analysis.   Carlos Fernandez-Granda Sparse regression Optimization-Based Data Analysis http://www.cims.nyu.edu/~cfgranda/pages/obda_spring16 Carlos Fernandez-Granda 3/28/2016 Regression Least-squares regression Example: Global warming Logistic

More information

10. Unconstrained minimization

10. Unconstrained minimization Convex Optimization Boyd & Vandenberghe 10. Unconstrained minimization terminology and assumptions gradient descent method steepest descent method Newton s method self-concordant functions implementation

More information

Probabilistic Optimal Estimation and Filtering

Probabilistic Optimal Estimation and Filtering Probabilistic Optimal Estimation and Filtering Least Squares and Randomized Algorithms Fabrizio Dabbene 1 Mario Sznaier 2 Roberto Tempo 1 1 CNR - IEIIT Politecnico di Torino 2 Northeastern University Boston

More information

Convex Optimization Boyd & Vandenberghe. 5. Duality

Convex Optimization Boyd & Vandenberghe. 5. Duality 5. Duality Convex Optimization Boyd & Vandenberghe Lagrange dual problem weak and strong duality geometric interpretation optimality conditions perturbation and sensitivity analysis examples generalized

More information

The Q-parametrization (Youla) Lecture 13: Synthesis by Convex Optimization. Lecture 13: Synthesis by Convex Optimization. Example: Spring-mass System

The Q-parametrization (Youla) Lecture 13: Synthesis by Convex Optimization. Lecture 13: Synthesis by Convex Optimization. Example: Spring-mass System The Q-parametrization (Youla) Lecture 3: Synthesis by Convex Optimization controlled variables z Plant distubances w Example: Spring-mass system measurements y Controller control inputs u Idea for lecture

More information

Sparse and Robust Optimization and Applications

Sparse and Robust Optimization and Applications Sparse and and Statistical Learning Workshop Les Houches, 2013 Robust Laurent El Ghaoui with Mert Pilanci, Anh Pham EECS Dept., UC Berkeley January 7, 2013 1 / 36 Outline Sparse Sparse Sparse Probability

More information

Lagrange Duality. Daniel P. Palomar. Hong Kong University of Science and Technology (HKUST)

Lagrange Duality. Daniel P. Palomar. Hong Kong University of Science and Technology (HKUST) Lagrange Duality Daniel P. Palomar Hong Kong University of Science and Technology (HKUST) ELEC5470 - Convex Optimization Fall 2017-18, HKUST, Hong Kong Outline of Lecture Lagrangian Dual function Dual

More information

Subgradient Method. Guest Lecturer: Fatma Kilinc-Karzan. Instructors: Pradeep Ravikumar, Aarti Singh Convex Optimization /36-725

Subgradient Method. Guest Lecturer: Fatma Kilinc-Karzan. Instructors: Pradeep Ravikumar, Aarti Singh Convex Optimization /36-725 Subgradient Method Guest Lecturer: Fatma Kilinc-Karzan Instructors: Pradeep Ravikumar, Aarti Singh Convex Optimization 10-725/36-725 Adapted from slides from Ryan Tibshirani Consider the problem Recall:

More information

EXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING

EXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING EXAM IN STATISTICAL MACHINE LEARNING STATISTISK MASKININLÄRNING DATE AND TIME: August 30, 2018, 14.00 19.00 RESPONSIBLE TEACHER: Niklas Wahlström NUMBER OF PROBLEMS: 5 AIDING MATERIAL: Calculator, mathematical

More information

Agenda. Applications of semidefinite programming. 1 Control and system theory. 2 Combinatorial and nonconvex optimization

Agenda. Applications of semidefinite programming. 1 Control and system theory. 2 Combinatorial and nonconvex optimization Agenda Applications of semidefinite programming 1 Control and system theory 2 Combinatorial and nonconvex optimization 3 Spectral estimation & super-resolution Control and system theory SDP in wide use

More information

Lecture 2: August 31

Lecture 2: August 31 0-704: Information Processing and Learning Fall 206 Lecturer: Aarti Singh Lecture 2: August 3 Note: These notes are based on scribed notes from Spring5 offering of this course. LaTeX template courtesy

More information

Primal-Dual Interior-Point Methods

Primal-Dual Interior-Point Methods Primal-Dual Interior-Point Methods Lecturer: Aarti Singh Co-instructor: Pradeep Ravikumar Convex Optimization 10-725/36-725 Outline Today: Primal-dual interior-point method Special case: linear programming

More information

Convex Optimization M2

Convex Optimization M2 Convex Optimization M2 Lecture 3 A. d Aspremont. Convex Optimization M2. 1/49 Duality A. d Aspremont. Convex Optimization M2. 2/49 DMs DM par email: dm.daspremont@gmail.com A. d Aspremont. Convex Optimization

More information

Lecture 5: Linear models for classification. Logistic regression. Gradient Descent. Second-order methods.

Lecture 5: Linear models for classification. Logistic regression. Gradient Descent. Second-order methods. Lecture 5: Linear models for classification. Logistic regression. Gradient Descent. Second-order methods. Linear models for classification Logistic regression Gradient descent and second-order methods

More information

Noisy Signal Recovery via Iterative Reweighted L1-Minimization

Noisy Signal Recovery via Iterative Reweighted L1-Minimization Noisy Signal Recovery via Iterative Reweighted L1-Minimization Deanna Needell UC Davis / Stanford University Asilomar SSC, November 2009 Problem Background Setup 1 Suppose x is an unknown signal in R d.

More information

Bayesian Models for Regularization in Optimization

Bayesian Models for Regularization in Optimization Bayesian Models for Regularization in Optimization Aleksandr Aravkin, UBC Bradley Bell, UW Alessandro Chiuso, Padova Michael Friedlander, UBC Gianluigi Pilloneto, Padova Jim Burke, UW MOPTA, Lehigh University,

More information

Lecture: Duality.

Lecture: Duality. Lecture: Duality http://bicmr.pku.edu.cn/~wenzw/opt-2016-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghe s lecture notes Introduction 2/35 Lagrange dual problem weak and strong

More information

Estimation and Optimization: Gaps and Bridges. MURI Meeting June 20, Laurent El Ghaoui. UC Berkeley EECS

Estimation and Optimization: Gaps and Bridges. MURI Meeting June 20, Laurent El Ghaoui. UC Berkeley EECS MURI Meeting June 20, 2001 Estimation and Optimization: Gaps and Bridges Laurent El Ghaoui EECS UC Berkeley 1 goals currently, estimation (of model parameters) and optimization (of decision variables)

More information

Exercises. Exercises. Basic terminology and optimality conditions. 4.2 Consider the optimization problem

Exercises. Exercises. Basic terminology and optimality conditions. 4.2 Consider the optimization problem Exercises Basic terminology and optimality conditions 4.1 Consider the optimization problem f 0(x 1, x 2) 2x 1 + x 2 1 x 1 + 3x 2 1 x 1 0, x 2 0. Make a sketch of the feasible set. For each of the following

More information

IE 521 Convex Optimization

IE 521 Convex Optimization Lecture 1: 16th January 2019 Outline 1 / 20 Which set is different from others? Figure: Four sets 2 / 20 Which set is different from others? Figure: Four sets 3 / 20 Interior, Closure, Boundary Definition.

More information

Lecture 9: Discrete-Time Linear Quadratic Regulator Finite-Horizon Case

Lecture 9: Discrete-Time Linear Quadratic Regulator Finite-Horizon Case Lecture 9: Discrete-Time Linear Quadratic Regulator Finite-Horizon Case Dr. Burak Demirel Faculty of Electrical Engineering and Information Technology, University of Paderborn December 15, 2015 2 Previous

More information

Constrained Optimization and Lagrangian Duality

Constrained Optimization and Lagrangian Duality CIS 520: Machine Learning Oct 02, 2017 Constrained Optimization and Lagrangian Duality Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may

More information

FRTN10 Multivariable Control, Lecture 13. Course outline. The Q-parametrization (Youla) Example: Spring-mass System

FRTN10 Multivariable Control, Lecture 13. Course outline. The Q-parametrization (Youla) Example: Spring-mass System FRTN Multivariable Control, Lecture 3 Anders Robertsson Automatic Control LTH, Lund University Course outline The Q-parametrization (Youla) L-L5 Purpose, models and loop-shaping by hand L6-L8 Limitations

More information

Contre-examples for Bayesian MAP restoration. Mila Nikolova

Contre-examples for Bayesian MAP restoration. Mila Nikolova Contre-examples for Bayesian MAP restoration Mila Nikolova CMLA ENS de Cachan, 61 av. du Président Wilson, 94235 Cachan cedex (nikolova@cmla.ens-cachan.fr) Obergurgl, September 26 Outline 1. MAP estimators

More information

Gauge optimization and duality

Gauge optimization and duality 1 / 54 Gauge optimization and duality Junfeng Yang Department of Mathematics Nanjing University Joint with Shiqian Ma, CUHK September, 2015 2 / 54 Outline Introduction Duality Lagrange duality Fenchel

More information

Outline Lecture 2 2(32)

Outline Lecture 2 2(32) Outline Lecture (3), Lecture Linear Regression and Classification it is our firm belief that an understanding of linear models is essential for understanding nonlinear ones Thomas Schön Division of Automatic

More information

Machine Learning Basics Lecture 7: Multiclass Classification. Princeton University COS 495 Instructor: Yingyu Liang

Machine Learning Basics Lecture 7: Multiclass Classification. Princeton University COS 495 Instructor: Yingyu Liang Machine Learning Basics Lecture 7: Multiclass Classification Princeton University COS 495 Instructor: Yingyu Liang Example: image classification indoor Indoor outdoor Example: image classification (multiclass)

More information

Regression. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh.

Regression. Machine Learning and Pattern Recognition. Chris Williams. School of Informatics, University of Edinburgh. Regression Machine Learning and Pattern Recognition Chris Williams School of Informatics, University of Edinburgh September 24 (All of the slides in this course have been adapted from previous versions

More information

Linear and non-linear programming

Linear and non-linear programming Linear and non-linear programming Benjamin Recht March 11, 2005 The Gameplan Constrained Optimization Convexity Duality Applications/Taxonomy 1 Constrained Optimization minimize f(x) subject to g j (x)

More information

SGD and Randomized projection algorithms for overdetermined linear systems

SGD and Randomized projection algorithms for overdetermined linear systems SGD and Randomized projection algorithms for overdetermined linear systems Deanna Needell Claremont McKenna College IPAM, Feb. 25, 2014 Includes joint work with Eldar, Ward, Tropp, Srebro-Ward Setup Setup

More information

New ways of dimension reduction? Cutting data sets into small pieces

New ways of dimension reduction? Cutting data sets into small pieces New ways of dimension reduction? Cutting data sets into small pieces Roman Vershynin University of Michigan, Department of Mathematics Statistical Machine Learning Ann Arbor, June 5, 2012 Joint work with

More information

Newton s Method. Javier Peña Convex Optimization /36-725

Newton s Method. Javier Peña Convex Optimization /36-725 Newton s Method Javier Peña Convex Optimization 10-725/36-725 1 Last time: dual correspondences Given a function f : R n R, we define its conjugate f : R n R, f ( (y) = max y T x f(x) ) x Properties and

More information

Enhanced Compressive Sensing and More

Enhanced Compressive Sensing and More Enhanced Compressive Sensing and More Yin Zhang Department of Computational and Applied Mathematics Rice University, Houston, Texas, U.S.A. Nonlinear Approximation Techniques Using L1 Texas A & M University

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu

More information

Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014

Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014 Convex Optimization Dani Yogatama School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA February 12, 2014 Dani Yogatama (Carnegie Mellon University) Convex Optimization February 12,

More information

Lecture 4: Linear and quadratic problems

Lecture 4: Linear and quadratic problems Lecture 4: Linear and quadratic problems linear programming examples and applications linear fractional programming quadratic optimization problems (quadratically constrained) quadratic programming second-order

More information

10-704: Information Processing and Learning Spring Lecture 8: Feb 5

10-704: Information Processing and Learning Spring Lecture 8: Feb 5 10-704: Information Processing and Learning Spring 2015 Lecture 8: Feb 5 Lecturer: Aarti Singh Scribe: Siheng Chen Disclaimer: These notes have not been subjected to the usual scrutiny reserved for formal

More information

Machine learning - HT Maximum Likelihood

Machine learning - HT Maximum Likelihood Machine learning - HT 2016 3. Maximum Likelihood Varun Kanade University of Oxford January 27, 2016 Outline Probabilistic Framework Formulate linear regression in the language of probability Introduce

More information

Duality. Geoff Gordon & Ryan Tibshirani Optimization /

Duality. Geoff Gordon & Ryan Tibshirani Optimization / Duality Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Duality in linear programs Suppose we want to find lower bound on the optimal value in our convex problem, B min x C f(x) E.g., consider

More information

Machine Learning. Support Vector Machines. Manfred Huber

Machine Learning. Support Vector Machines. Manfred Huber Machine Learning Support Vector Machines Manfred Huber 2015 1 Support Vector Machines Both logistic regression and linear discriminant analysis learn a linear discriminant function to separate the data

More information

Relaxations and Randomized Methods for Nonconvex QCQPs

Relaxations and Randomized Methods for Nonconvex QCQPs Relaxations and Randomized Methods for Nonconvex QCQPs Alexandre d Aspremont, Stephen Boyd EE392o, Stanford University Autumn, 2003 Introduction While some special classes of nonconvex problems can be

More information

Linear Models for Classification

Linear Models for Classification Linear Models for Classification Oliver Schulte - CMPT 726 Bishop PRML Ch. 4 Classification: Hand-written Digit Recognition CHINE INTELLIGENCE, VOL. 24, NO. 24, APRIL 2002 x i = t i = (0, 0, 0, 1, 0, 0,

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

Large-Scale L1-Related Minimization in Compressive Sensing and Beyond

Large-Scale L1-Related Minimization in Compressive Sensing and Beyond Large-Scale L1-Related Minimization in Compressive Sensing and Beyond Yin Zhang Department of Computational and Applied Mathematics Rice University, Houston, Texas, U.S.A. Arizona State University March

More information

Statistical Data Mining and Machine Learning Hilary Term 2016

Statistical Data Mining and Machine Learning Hilary Term 2016 Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes

More information