Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014


1 Convex Optimization. Dani Yogatama. School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA. February 12, 2014.

2 Key Concepts in Convex Analysis: Convex Sets

3 Key Concepts in Convex Analysis: Convex Functions

4 Key Concepts in Convex Analysis: Minimizers


7 Key Concepts in Convex Analysis: Strong Convexity

Recall the definition of a convex function: $\forall \lambda \in [0, 1]$,
$$f(\lambda x + (1 - \lambda) x') \le \lambda f(x) + (1 - \lambda) f(x') \quad \text{(convexity)}$$

A $\beta$-strongly convex function satisfies a stronger condition: $\forall \lambda \in [0, 1]$,
$$f(\lambda x + (1 - \lambda) x') \le \lambda f(x) + (1 - \lambda) f(x') - \frac{\beta}{2} \lambda (1 - \lambda) \|x - x'\|_2^2 \quad \text{(strong convexity)}$$

Strong convexity $\Rightarrow$ strict convexity.


12 Key Concepts in Convex Analysis: Subgradients

Convexity implies continuity, but convexity does not imply differentiability (e.g., $f(w) = \|w\|_1$). Subgradients generalize gradients to (possibly non-differentiable) convex functions:

$v$ is a subgradient of $f$ at $x$ if $f(x') \ge f(x) + v^\top (x' - x)$ for all $x'$ (a linear lower bound on $f$).

Subdifferential: $\partial f(x) = \{v : v \text{ is a subgradient of } f \text{ at } x\}$

If $f$ is differentiable at $x$, then $\partial f(x) = \{\nabla f(x)\}$.

Notation: $\partial f(x)$ is also used to denote an (arbitrary) subgradient of $f$ at $x$.
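The linear-lower-bound property can be checked numerically. Below is a minimal NumPy sketch (our own illustration, not from the slides) verifying it for $f(w) = \|w\|_1$ with the subgradient $v = \operatorname{sign}(x)$:

```python
import numpy as np

# f(w) = ||w||_1 is convex but not differentiable at 0; sign(x) is a valid
# subgradient (at a zero coordinate, any value in [-1, 1] would also do).
f = lambda w: np.abs(w).sum()
rng = np.random.default_rng(0)
x = rng.standard_normal(5)
v = np.sign(x)  # v is in the subdifferential of f at x

# Check the linear lower bound f(x') >= f(x) + v^T (x' - x) at random points.
checks = []
for _ in range(100):
    xp = rng.standard_normal(5)
    checks.append(f(xp) >= f(x) + v @ (xp - x) - 1e-12)
assert all(checks)
```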


14 Establishing Convexity

How to check whether $f(x)$ is a convex function?

- Verify the definition of a convex function directly.
- Check whether $\nabla^2 f(x) \succeq 0$ (for a twice-differentiable function).
- Show that it is constructed from simple convex functions using operations that preserve convexity:
  - nonnegative weighted sum
  - composition with an affine function
  - pointwise maximum and supremum
  - composition
  - minimization
  - perspective

Reference: Boyd and Vandenberghe (2004)
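The Hessian test can be approximated with finite differences. The sketch below is our own illustration (the helper `numerical_hessian` is a hypothetical name, assuming only NumPy); it checks that the Hessian of log-sum-exp, a classic convex function, is positive semidefinite at a test point:

```python
import numpy as np

def numerical_hessian(f, x, eps=1e-4):
    """Central-difference approximation of the Hessian of f at x."""
    n = x.size
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            e_i, e_j = np.zeros(n), np.zeros(n)
            e_i[i], e_j[j] = eps, eps
            H[i, j] = (f(x + e_i + e_j) - f(x + e_i - e_j)
                       - f(x - e_i + e_j) + f(x - e_i - e_j)) / (4 * eps**2)
    return H

# log-sum-exp: convex, so its Hessian must be PSD everywhere.
f = lambda x: np.log(np.exp(x).sum())
H = numerical_hessian(f, np.array([0.3, -1.2, 0.7]))
assert np.linalg.eigvalsh(H).min() >= -1e-6  # PSD up to numerical error
```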

15 Unconstrained Optimization

Algorithms:
- First-order methods (gradient descent, FISTA, etc.)
- Higher-order methods (Newton's method, ellipsoid, etc.)
- ...


17 Gradient Descent

Problem: $\min_x f(x)$

Algorithm:
- $g_t = \nabla f(x_{t-1})$
- $x_t = x_{t-1} - \eta g_t$
- Repeat until convergence.
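As a minimal sketch (our own toy example, assuming only NumPy), gradient descent on the quadratic $f(x) = \frac{1}{2} x^\top A x - b^\top x$ recovers the solution of $Ax = b$; the step size $\eta$ and iteration budget are illustrative choices:

```python
import numpy as np

# Minimize f(x) = 1/2 x^T A x - b^T x, whose gradient is A x - b;
# the minimizer solves A x = b.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])

grad = lambda x: A @ x - b
x = np.zeros(2)
eta = 0.1
for _ in range(500):
    g = grad(x)      # g_t = gradient at x_{t-1}
    x = x - eta * g  # x_t = x_{t-1} - eta * g_t
    if np.linalg.norm(g) < 1e-10:  # "repeat until convergence"
        break

assert np.allclose(x, np.linalg.solve(A, b), atol=1e-6)
```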


20 Newton's Method

Problem: $\min_x f(x)$, where $f$ is assumed twice differentiable.

Algorithm:
- $g_t = \nabla f(x_{t-1})$
- $s_t = H^{-1} g_t$, where $H = \nabla^2 f(x_{t-1})$ is the Hessian
- $x_t = x_{t-1} - \eta s_t$
- Repeat until convergence.

Newton's method is a special case of steepest descent using the Hessian norm.
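A minimal sketch of Newton's method on a smooth strictly convex function (our own toy example, assuming only NumPy; a full Newton step $\eta = 1$ is used):

```python
import numpy as np

# Our illustrative objective: f(x) = exp(x1 + x2) + x1^2 + x2^2.
def grad(x):
    e = np.exp(x[0] + x[1])
    return np.array([e + 2 * x[0], e + 2 * x[1]])

def hessian(x):
    e = np.exp(x[0] + x[1])
    return np.array([[e + 2, e], [e, e + 2]])

x = np.zeros(2)
for _ in range(20):
    g = grad(x)
    s = np.linalg.solve(hessian(x), g)  # s_t = H^{-1} g_t
    x = x - s                           # full Newton step (eta = 1)
    if np.linalg.norm(g) < 1e-10:
        break

assert np.linalg.norm(grad(x)) < 1e-8  # stationary point reached
```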


23 Duality

Primal problem:
$$\min_x f(x) \quad \text{subject to} \quad g_i(x) \le 0, \; i = 1, \dots, m; \quad h_i(x) = 0, \; i = 1, \dots, p$$
for $x \in X$.

Lagrangian:
$$L(x, \lambda, \nu) = f(x) + \sum_{i=1}^m \lambda_i g_i(x) + \sum_{i=1}^p \nu_i h_i(x)$$
$\lambda_i$ and $\nu_i$ are Lagrange multipliers.

Suppose $x$ is feasible and $\lambda \ge 0$; then we get the lower bound:
$$f(x) \ge L(x, \lambda, \nu) \quad \forall \text{ feasible } x \in X, \; \lambda \in \mathbb{R}^m_+$$


27 Duality

Primal optimal: $p^* = \min_x \max_{\lambda \ge 0, \nu} L(x, \lambda, \nu)$

Lagrange dual function: $g(\lambda, \nu) = \min_x L(x, \lambda, \nu)$. This is a concave function of $(\lambda, \nu)$, regardless of whether $f(x)$ is convex or not. It can be $-\infty$ for some $\lambda$ and $\nu$.

Lagrange dual problem: $\max_{\lambda, \nu} g(\lambda, \nu)$ subject to $\lambda \ge 0$.

Dual feasible: $\lambda \ge 0$ and $(\lambda, \nu) \in \operatorname{dom} g$.

Dual optimal: $d^* = \max_{\lambda \ge 0, \nu} \min_x L(x, \lambda, \nu)$


31 Duality

Weak duality: $d^* \le p^*$ always holds, for convex and nonconvex problems.

Strong duality: $p^* = d^*$ does not hold in general, but usually holds for convex problems. Strong duality is guaranteed by Slater's constraint qualification: it holds if the problem is strictly feasible, i.e.,
$$\exists x \in \operatorname{int} D \;\; \text{s.t.} \;\; g_i(x) < 0, \; i = 1, \dots, m, \quad h_i(x) = 0, \; i = 1, \dots, p$$

Assume strong duality holds and $p^*$ and $d^*$ are attained:
$$p^* = f(x^*) = d^* = \min_x L(x, \lambda^*, \nu^*) \le L(x^*, \lambda^*, \nu^*) \le f(x^*) = p^*$$

We therefore have:
- $x^* \in \arg\min_x L(x, \lambda^*, \nu^*)$
- $\lambda_i^* g_i(x^*) = 0$ for $i = 1, \dots, m$ (complementary slackness)
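These statements can be checked on a toy one-dimensional problem (our own example, not from the slides): minimize $x^2$ subject to $1 - x \le 0$, whose dual function is $g(\lambda) = \lambda - \lambda^2/4$:

```python
import numpy as np

# Primal: min x^2  subject to  g(x) = 1 - x <= 0.  Slater's condition holds
# (x = 2 is strictly feasible), so strong duality should give p* = d*.
lam = np.linspace(0.0, 5.0, 100001)
dual = lam - lam**2 / 4  # min_x x^2 + lam*(1 - x), attained at x = lam/2
d_star = dual.max()
p_star = 1.0             # primal optimum, attained at x* = 1

assert d_star <= p_star + 1e-9       # weak duality: d* <= p*
assert abs(d_star - p_star) < 1e-6   # strong duality for this convex problem
lam_star = lam[dual.argmax()]        # dual optimum at lambda* = 2
assert abs(lam_star * (1 - 1.0)) < 1e-12  # compl. slackness: lambda* g(x*) = 0
```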


33 Karush-Kuhn-Tucker Conditions

For differentiable $g_i(x)$ and $h_i(x)$, the KKT conditions are:
$$g_i(x^*) \le 0, \quad h_i(x^*) = 0 \quad \text{(primal feasibility)}$$
$$\lambda_i^* \ge 0 \quad \text{(dual feasibility)}$$
$$\lambda_i^* g_i(x^*) = 0 \quad \text{(complementary slackness)}$$
$$\nabla_x L(x, \lambda^*, \nu^*) \big|_{x = x^*} = 0 \quad \text{(Lagrangian stationarity)}$$

If $\hat{x}, \hat{\lambda}, \hat{\nu}$ satisfy the KKT conditions for a convex problem, they are optimal.


36 Support Vector Machines

Primal problem (hard constraint):
$$\min_w \frac{1}{2} \|w\|_2^2 \quad \text{subject to} \quad y_i \langle x_i, w \rangle \ge 1, \; i = 1, \dots, n$$

Lagrangian:
$$L(w, \lambda) = \frac{1}{2} \|w\|_2^2 - \sum_{i=1}^n \lambda_i (y_i \langle x_i, w \rangle - 1)$$

Minimizing with respect to $w$:
$$\frac{\partial L(w, \lambda)}{\partial w} = 0 \;\Rightarrow\; w - \sum_{i=1}^n \lambda_i y_i x_i = 0 \;\Rightarrow\; w = \sum_{i=1}^n \lambda_i y_i x_i$$


39 Support Vector Machines

Plugging this back into the Lagrangian:
$$L(\lambda) = \sum_{i=1}^n \lambda_i - \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n y_i y_j \lambda_i \lambda_j x_i^\top x_j$$

The Lagrange dual problem is:
$$\max_\lambda \; \sum_{i=1}^n \lambda_i - \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n y_i y_j \lambda_i \lambda_j x_i^\top x_j$$
$$\text{subject to} \quad \lambda_i \ge 0, \; i = 1, \dots, n, \quad \sum_{i=1}^n \lambda_i y_i = 0$$

Since this problem depends on the data only through the inner products $x_i^\top x_j$, we can use kernels and learn in a high-dimensional feature space without having to explicitly represent $\phi(x)$.
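The kernel trick can be illustrated numerically (our own sketch, not from the slides): the degree-2 polynomial kernel $k(x, z) = (x^\top z)^2$ in two dimensions equals an inner product under an explicit feature map $\phi$, which the kernel evaluation never has to construct:

```python
import numpy as np

rng = np.random.default_rng(1)

def phi(x):
    """Explicit feature map for the degree-2 polynomial kernel in 2-D."""
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

# k(x, z) = (x . z)^2 = phi(x) . phi(z): kernel evaluation needs no phi.
errs = []
for _ in range(100):
    x, z = rng.standard_normal(2), rng.standard_normal(2)
    k = (x @ z) ** 2
    errs.append(abs(k - phi(x) @ phi(z)))
assert max(errs) < 1e-9
```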

40 Support Vector Machines

Primal problem (soft constraint):
$$\min_{w, \xi} \frac{1}{2} \|w\|_2^2 + C \sum_{i=1}^n \xi_i$$
$$\text{subject to} \quad y_i \langle x_i, w \rangle \ge 1 - \xi_i, \; i = 1, \dots, n, \quad \xi_i \ge 0, \; i = 1, \dots, n$$


42 Support Vector Machines

Lagrange dual problem for the soft constraint:
$$\max_\lambda \; \sum_{i=1}^n \lambda_i - \frac{1}{2} \sum_{i=1}^n \sum_{j=1}^n y_i y_j \lambda_i \lambda_j x_i^\top x_j \quad (1)$$
$$\text{subject to} \quad 0 \le \lambda_i \le C, \; i = 1, \dots, n \quad (2)$$
$$\sum_{i=1}^n \lambda_i y_i = 0 \quad (3)$$

KKT conditions, for all $i$:
$$\lambda_i = 0 \;\Rightarrow\; y_i \langle x_i, w \rangle \ge 1 \quad (4)$$
$$0 < \lambda_i < C \;\Rightarrow\; y_i \langle x_i, w \rangle = 1 \quad (5)$$
$$\lambda_i = C \;\Rightarrow\; y_i \langle x_i, w \rangle \le 1 \quad (6)$$


45 Sequential Minimal Optimization (Platt, 1998)

An efficient way to solve the SVM dual problem: break the large QP problem into a series of the smallest possible QP subproblems, and solve these small subproblems analytically.

In a nutshell:
- Choose two Lagrange multipliers $\lambda_i$ and $\lambda_j$.
- Optimize the dual problem with respect to these two Lagrange multipliers, holding the others fixed.
- Repeat until convergence.

There are heuristics for choosing the two Lagrange multipliers so as to maximize the step size towards the global maximum. The first is chosen from examples that violate the KKT conditions; the second is chosen using an approximation based on the absolute difference in error values (see Platt (1998)).

46 Sequential Minimal Optimization (Platt, 1998)


49 Sequential Minimal Optimization (Platt, 1998)

For any two Lagrange multipliers $\lambda_i, \lambda_j$, the constraints are:
$$0 \le \lambda_i, \lambda_j \le C \quad (7)$$
$$y_i \lambda_i + y_j \lambda_j = -\sum_{k \ne i, j} y_k \lambda_k = \gamma \quad (8)$$

Express $\lambda_i$ in terms of $\lambda_j$:
$$\lambda_i = \frac{\gamma - \lambda_j y_j}{y_i}$$

Plugging this back into the objective, we are left with a very simple quadratic problem in the single variable $\lambda_j$.


52 Sequential Minimal Optimization (Platt, 1998)

Solve for the second Lagrange multiplier $\lambda_j$.

If $y_i \ne y_j$, the following bounds apply to $\lambda_j$:
$$L = \max(0, \lambda_j^{t-1} - \lambda_i^{t-1}) \quad (9)$$
$$H = \min(C, C + \lambda_j^{t-1} - \lambda_i^{t-1}) \quad (10)$$

If $y_i = y_j$, the following bounds apply to $\lambda_j$:
$$L = \max(0, \lambda_j^{t-1} + \lambda_i^{t-1} - C) \quad (11)$$
$$H = \min(C, \lambda_j^{t-1} + \lambda_i^{t-1}) \quad (12)$$

The (clipped) solution is:
$$\lambda_j = \begin{cases} H & \text{if } \lambda_j > H \\ \lambda_j & \text{if } L \le \lambda_j \le H \\ L & \text{if } \lambda_j < L \end{cases}$$
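The bounds (9)-(12) and the clipping step can be sketched as follows (a minimal illustration; the helper names `smo_bounds` and `clip` are ours, not from Platt (1998)):

```python
def smo_bounds(lam_i, lam_j, y_i, y_j, C):
    """Feasible interval [L, H] for the new lambda_j, per Eqs. (9)-(12)."""
    if y_i != y_j:
        L = max(0.0, lam_j - lam_i)
        H = min(C, C + lam_j - lam_i)
    else:
        L = max(0.0, lam_j + lam_i - C)
        H = min(C, lam_j + lam_i)
    return L, H

def clip(lam_j_new, L, H):
    """Clip the unconstrained quadratic solution back into [L, H]."""
    return max(L, min(H, lam_j_new))

# Example: y_i != y_j, C = 1, previous multipliers (0.3, 0.8).
L, H = smo_bounds(0.3, 0.8, +1, -1, 1.0)
assert (L, H) == (0.5, 1.0)
assert clip(1.7, L, H) == 1.0   # above H -> clipped to H
assert clip(0.6, L, H) == 0.6   # inside [L, H] -> unchanged
```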


55 Fenchel Duality

If the convex conjugate of $f(x)$ is known, the dual function can be derived easily. The convex conjugate of a function $f$ is:
$$f^*(y) = \max_x \; \langle y, x \rangle - f(x) \quad (13)$$

For a generic problem
$$\min_x f(x) \quad \text{subject to} \quad Ax \le b, \quad Cx = d$$
the dual function is:
$$-f^*(-A^\top \lambda - C^\top \nu) - b^\top \lambda - d^\top \nu$$

There are many functions whose conjugates are easy to compute:
- Exponential
- Logistic
- Quadratic form
- Log determinant
- ...
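Equation (13) can be checked numerically (our own sketch, not from the slides): for $f(x) = x^2/2$, the maximizer of $yx - f(x)$ is $x = y$, so $f^*(y) = y^2/2$, i.e., $f$ is its own conjugate:

```python
import numpy as np

# Approximate f*(y) = max_x <y, x> - f(x) on a fine grid for f(x) = x^2 / 2.
xs = np.linspace(-10, 10, 200001)
f = xs**2 / 2

errs = []
for y in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    conjugate = (y * xs - f).max()     # grid approximation of the maximum
    errs.append(abs(conjugate - y**2 / 2))
assert max(errs) < 1e-6  # matches the closed form f*(y) = y^2 / 2
```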

56 Parting Notes

The dual formulation is useful: it gives new insights into our problem, and it allows us to develop better optimization methods and use kernel tricks.

57 Thank you! Questions?

58 References

Some slides are from an upcoming EACL 2014 tutorial with Andre F. T. Martins, Noah A. Smith, and Mario F. T. Figueiredo.

Boyd, S. and Vandenberghe, L. (2004). Convex Optimization. Cambridge University Press.

Platt, J. (1998). Fast training of support vector machines using sequential minimal optimization. In Scholkopf, B., Burges, C., and Smola, A., editors, Advances in Kernel Methods - Support Vector Learning. MIT Press.


More information

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Support vector machines In a nutshell Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Solution only depends on a small subset of training

More information

Support Vector Machine

Support Vector Machine Andrea Passerini passerini@disi.unitn.it Machine Learning Support vector machines In a nutshell Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)

More information

Convex Optimization Overview (cnt d)

Convex Optimization Overview (cnt d) Conve Optimization Overview (cnt d) Chuong B. Do November 29, 2009 During last week s section, we began our study of conve optimization, the study of mathematical optimization problems of the form, minimize

More information

Lagrangian Duality and Convex Optimization

Lagrangian Duality and Convex Optimization Lagrangian Duality and Convex Optimization David Rosenberg New York University February 11, 2015 David Rosenberg (New York University) DS-GA 1003 February 11, 2015 1 / 24 Introduction Why Convex Optimization?

More information

Gaps in Support Vector Optimization

Gaps in Support Vector Optimization Gaps in Support Vector Optimization Nikolas List 1 (student author), Don Hush 2, Clint Scovel 2, Ingo Steinwart 2 1 Lehrstuhl Mathematik und Informatik, Ruhr-University Bochum, Germany nlist@lmi.rub.de

More information

Lecture Note 5: Semidefinite Programming for Stability Analysis

Lecture Note 5: Semidefinite Programming for Stability Analysis ECE7850: Hybrid Systems:Theory and Applications Lecture Note 5: Semidefinite Programming for Stability Analysis Wei Zhang Assistant Professor Department of Electrical and Computer Engineering Ohio State

More information

Cheng Soon Ong & Christian Walder. Canberra February June 2018

Cheng Soon Ong & Christian Walder. Canberra February June 2018 Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression

More information

Support Vector Machine (continued)

Support Vector Machine (continued) Support Vector Machine continued) Overlapping class distribution: In practice the class-conditional distributions may overlap, so that the training data points are no longer linearly separable. We need

More information

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Support vector machines In a nutshell Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Solution only depends on a small subset of training

More information

Support Vector Machines

Support Vector Machines Support Vector Machines Statistical Inference with Reproducing Kernel Hilbert Space Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University for

More information

Lecture 2: Linear SVM in the Dual

Lecture 2: Linear SVM in the Dual Lecture 2: Linear SVM in the Dual Stéphane Canu stephane.canu@litislab.eu São Paulo 2015 July 22, 2015 Road map 1 Linear SVM Optimization in 10 slides Equality constraints Inequality constraints Dual formulation

More information

SMO Algorithms for Support Vector Machines without Bias Term

SMO Algorithms for Support Vector Machines without Bias Term Institute of Automatic Control Laboratory for Control Systems and Process Automation Prof. Dr.-Ing. Dr. h. c. Rolf Isermann SMO Algorithms for Support Vector Machines without Bias Term Michael Vogt, 18-Jul-2002

More information

Support Vector Machine (SVM) and Kernel Methods

Support Vector Machine (SVM) and Kernel Methods Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2016 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin

More information

Support Vector Machine (SVM) and Kernel Methods

Support Vector Machine (SVM) and Kernel Methods Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2015 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin

More information

Lectures 9 and 10: Constrained optimization problems and their optimality conditions

Lectures 9 and 10: Constrained optimization problems and their optimality conditions Lectures 9 and 10: Constrained optimization problems and their optimality conditions Coralia Cartis, Mathematical Institute, University of Oxford C6.2/B2: Continuous Optimization Lectures 9 and 10: Constrained

More information

Lecture Support Vector Machine (SVM) Classifiers

Lecture Support Vector Machine (SVM) Classifiers Introduction to Machine Learning Lecturer: Amir Globerson Lecture 6 Fall Semester Scribe: Yishay Mansour 6.1 Support Vector Machine (SVM) Classifiers Classification is one of the most important tasks in

More information

Statistical Machine Learning from Data

Statistical Machine Learning from Data Samy Bengio Statistical Machine Learning from Data 1 Statistical Machine Learning from Data Support Vector Machines Samy Bengio IDIAP Research Institute, Martigny, Switzerland, and Ecole Polytechnique

More information

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers)

Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Support vector machines In a nutshell Linear classifiers selecting hyperplane maximizing separation margin between classes (large margin classifiers) Solution only depends on a small subset of training

More information

Constrained Optimization

Constrained Optimization 1 / 22 Constrained Optimization ME598/494 Lecture Max Yi Ren Department of Mechanical Engineering, Arizona State University March 30, 2015 2 / 22 1. Equality constraints only 1.1 Reduced gradient 1.2 Lagrange

More information

Convex Optimization and Modeling

Convex Optimization and Modeling Convex Optimization and Modeling Duality Theory and Optimality Conditions 5th lecture, 12.05.2010 Jun.-Prof. Matthias Hein Program of today/next lecture Lagrangian and duality: the Lagrangian the dual

More information

Lecture 7: Convex Optimizations

Lecture 7: Convex Optimizations Lecture 7: Convex Optimizations Radu Balan, David Levermore March 29, 2018 Convex Sets. Convex Functions A set S R n is called a convex set if for any points x, y S the line segment [x, y] := {tx + (1

More information

Math 273a: Optimization Subgradients of convex functions

Math 273a: Optimization Subgradients of convex functions Math 273a: Optimization Subgradients of convex functions Made by: Damek Davis Edited by Wotao Yin Department of Mathematics, UCLA Fall 2015 online discussions on piazza.com 1 / 42 Subgradients Assumptions

More information

minimize x subject to (x 2)(x 4) u,

minimize x subject to (x 2)(x 4) u, Math 6366/6367: Optimization and Variational Methods Sample Preliminary Exam Questions 1. Suppose that f : [, L] R is a C 2 -function with f () on (, L) and that you have explicit formulae for

More information

ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints

ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints ISM206 Lecture Optimization of Nonlinear Objective with Linear Constraints Instructor: Prof. Kevin Ross Scribe: Nitish John October 18, 2011 1 The Basic Goal The main idea is to transform a given constrained

More information

Solving the SVM Optimization Problem

Solving the SVM Optimization Problem Solving the SVM Optimization Problem Kernel-based Learning Methods Christian Igel Institut für Neuroinformatik Ruhr-Universität Bochum, Germany http://www.neuroinformatik.rub.de July 16, 2009 Christian

More information

Tutorial on Convex Optimization for Engineers Part II

Tutorial on Convex Optimization for Engineers Part II Tutorial on Convex Optimization for Engineers Part II M.Sc. Jens Steinwandt Communications Research Laboratory Ilmenau University of Technology PO Box 100565 D-98684 Ilmenau, Germany jens.steinwandt@tu-ilmenau.de

More information

Support Vector Machine I

Support Vector Machine I Support Vector Machine I Statistical Data Analysis with Positive Definite Kernels Kenji Fukumizu Institute of Statistical Mathematics, ROIS Department of Statistical Science, Graduate University for Advanced

More information

Duality. Geoff Gordon & Ryan Tibshirani Optimization /

Duality. Geoff Gordon & Ryan Tibshirani Optimization / Duality Geoff Gordon & Ryan Tibshirani Optimization 10-725 / 36-725 1 Duality in linear programs Suppose we want to find lower bound on the optimal value in our convex problem, B min x C f(x) E.g., consider

More information

Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012

Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology. M. Soleymani Fall 2012 Support Vector Machine (SVM) & Kernel CE-717: Machine Learning Sharif University of Technology M. Soleymani Fall 2012 Linear classifier Which classifier? x 2 x 1 2 Linear classifier Margin concept x 2

More information

Support Vector Machine (SVM) and Kernel Methods

Support Vector Machine (SVM) and Kernel Methods Support Vector Machine (SVM) and Kernel Methods CE-717: Machine Learning Sharif University of Technology Fall 2014 Soleymani Outline Margin concept Hard-Margin SVM Soft-Margin SVM Dual Problems of Hard-Margin

More information

CSC 411 Lecture 17: Support Vector Machine

CSC 411 Lecture 17: Support Vector Machine CSC 411 Lecture 17: Support Vector Machine Ethan Fetaya, James Lucas and Emad Andrews University of Toronto CSC411 Lec17 1 / 1 Today Max-margin classification SVM Hard SVM Duality Soft SVM CSC411 Lec17

More information

Tutorial on Convex Optimization: Part II

Tutorial on Convex Optimization: Part II Tutorial on Convex Optimization: Part II Dr. Khaled Ardah Communications Research Laboratory TU Ilmenau Dec. 18, 2018 Outline Convex Optimization Review Lagrangian Duality Applications Optimal Power Allocation

More information

SVMs, Duality and the Kernel Trick

SVMs, Duality and the Kernel Trick SVMs, Duality and the Kernel Trick Machine Learning 10701/15781 Carlos Guestrin Carnegie Mellon University February 26 th, 2007 2005-2007 Carlos Guestrin 1 SVMs reminder 2005-2007 Carlos Guestrin 2 Today

More information

Support Vector Machines

Support Vector Machines Two SVM tutorials linked in class website (please, read both): High-level presentation with applications (Hearst 1998) Detailed tutorial (Burges 1998) Support Vector Machines Machine Learning 10701/15781

More information

Machine Learning A Geometric Approach

Machine Learning A Geometric Approach Machine Learning A Geometric Approach CIML book Chap 7.7 Linear Classification: Support Vector Machines (SVM) Professor Liang Huang some slides from Alex Smola (CMU) Linear Separator Ham Spam From Perceptron

More information

Lagrangian Duality. Richard Lusby. Department of Management Engineering Technical University of Denmark

Lagrangian Duality. Richard Lusby. Department of Management Engineering Technical University of Denmark Lagrangian Duality Richard Lusby Department of Management Engineering Technical University of Denmark Today s Topics (jg Lagrange Multipliers Lagrangian Relaxation Lagrangian Duality R Lusby (42111) Lagrangian

More information

Interior Point Algorithms for Constrained Convex Optimization

Interior Point Algorithms for Constrained Convex Optimization Interior Point Algorithms for Constrained Convex Optimization Chee Wei Tan CS 8292 : Advanced Topics in Convex Optimization and its Applications Fall 2010 Outline Inequality constrained minimization problems

More information

Support Vector Machines

Support Vector Machines EE 17/7AT: Optimization Models in Engineering Section 11/1 - April 014 Support Vector Machines Lecturer: Arturo Fernandez Scribe: Arturo Fernandez 1 Support Vector Machines Revisited 1.1 Strictly) Separable

More information

Nonlinear Optimization: What s important?

Nonlinear Optimization: What s important? Nonlinear Optimization: What s important? Julian Hall 10th May 2012 Convexity: convex problems A local minimizer is a global minimizer A solution of f (x) = 0 (stationary point) is a minimizer A global

More information

Duality Uses and Correspondences. Ryan Tibshirani Convex Optimization

Duality Uses and Correspondences. Ryan Tibshirani Convex Optimization Duality Uses and Correspondences Ryan Tibshirani Conve Optimization 10-725 Recall that for the problem Last time: KKT conditions subject to f() h i () 0, i = 1,... m l j () = 0, j = 1,... r the KKT conditions

More information

Introduction to Support Vector Machines

Introduction to Support Vector Machines Introduction to Support Vector Machines Shivani Agarwal Support Vector Machines (SVMs) Algorithm for learning linear classifiers Motivated by idea of maximizing margin Efficient extension to non-linear

More information

Quiz Discussion. IE417: Nonlinear Programming: Lecture 12. Motivation. Why do we care? Jeff Linderoth. 16th March 2006

Quiz Discussion. IE417: Nonlinear Programming: Lecture 12. Motivation. Why do we care? Jeff Linderoth. 16th March 2006 Quiz Discussion IE417: Nonlinear Programming: Lecture 12 Jeff Linderoth Department of Industrial and Systems Engineering Lehigh University 16th March 2006 Motivation Why do we care? We are interested in

More information

Convex Optimization M2

Convex Optimization M2 Convex Optimization M2 Lecture 8 A. d Aspremont. Convex Optimization M2. 1/57 Applications A. d Aspremont. Convex Optimization M2. 2/57 Outline Geometrical problems Approximation problems Combinatorial

More information

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44

Convex Optimization. Newton s method. ENSAE: Optimisation 1/44 Convex Optimization Newton s method ENSAE: Optimisation 1/44 Unconstrained minimization minimize f(x) f convex, twice continuously differentiable (hence dom f open) we assume optimal value p = inf x f(x)

More information

Lecture 10: A brief introduction to Support Vector Machine

Lecture 10: A brief introduction to Support Vector Machine Lecture 10: A brief introduction to Support Vector Machine Advanced Applied Multivariate Analysis STAT 2221, Fall 2013 Sungkyu Jung Department of Statistics, University of Pittsburgh Xingye Qiao Department

More information

Applied inductive learning - Lecture 7

Applied inductive learning - Lecture 7 Applied inductive learning - Lecture 7 Louis Wehenkel & Pierre Geurts Department of Electrical Engineering and Computer Science University of Liège Montefiore - Liège - November 5, 2012 Find slides: http://montefiore.ulg.ac.be/

More information

Max Margin-Classifier

Max Margin-Classifier Max Margin-Classifier Oliver Schulte - CMPT 726 Bishop PRML Ch. 7 Outline Maximum Margin Criterion Math Maximizing the Margin Non-Separable Data Kernels and Non-linear Mappings Where does the maximization

More information

Support Vector Machines for Regression

Support Vector Machines for Regression COMP-566 Rohan Shah (1) Support Vector Machines for Regression Provided with n training data points {(x 1, y 1 ), (x 2, y 2 ),, (x n, y n )} R s R we seek a function f for a fixed ɛ > 0 such that: f(x

More information

subject to (x 2)(x 4) u,

subject to (x 2)(x 4) u, Exercises Basic definitions 5.1 A simple example. Consider the optimization problem with variable x R. minimize x 2 + 1 subject to (x 2)(x 4) 0, (a) Analysis of primal problem. Give the feasible set, the

More information

WHY DUALITY? Gradient descent Newton s method Quasi-newton Conjugate gradients. No constraints. Non-differentiable ???? Constrained problems? ????

WHY DUALITY? Gradient descent Newton s method Quasi-newton Conjugate gradients. No constraints. Non-differentiable ???? Constrained problems? ???? DUALITY WHY DUALITY? No constraints f(x) Non-differentiable f(x) Gradient descent Newton s method Quasi-newton Conjugate gradients etc???? Constrained problems? f(x) subject to g(x) apple 0???? h(x) =0

More information

Machine Learning Support Vector Machines. Prof. Matteo Matteucci

Machine Learning Support Vector Machines. Prof. Matteo Matteucci Machine Learning Support Vector Machines Prof. Matteo Matteucci Discriminative vs. Generative Approaches 2 o Generative approach: we derived the classifier from some generative hypothesis about the way

More information

12. Interior-point methods

12. Interior-point methods 12. Interior-point methods Convex Optimization Boyd & Vandenberghe inequality constrained minimization logarithmic barrier function and central path barrier method feasibility and phase I methods complexity

More information

Dual Decomposition.

Dual Decomposition. 1/34 Dual Decomposition http://bicmr.pku.edu.cn/~wenzw/opt-2017-fall.html Acknowledgement: this slides is based on Prof. Lieven Vandenberghes lecture notes Outline 2/34 1 Conjugate function 2 introduction:

More information

CONVEX OPTIMIZATION, DUALITY, AND THEIR APPLICATION TO SUPPORT VECTOR MACHINES. Contents 1. Introduction 1 2. Convex Sets

CONVEX OPTIMIZATION, DUALITY, AND THEIR APPLICATION TO SUPPORT VECTOR MACHINES. Contents 1. Introduction 1 2. Convex Sets CONVEX OPTIMIZATION, DUALITY, AND THEIR APPLICATION TO SUPPORT VECTOR MACHINES DANIEL HENDRYCKS Abstract. This paper develops the fundamentals of convex optimization and applies them to Support Vector

More information

Convex Optimization Overview (cnt d)

Convex Optimization Overview (cnt d) Convex Optimization Overview (cnt d) Chuong B. Do October 6, 007 1 Recap During last week s section, we began our study of convex optimization, the study of mathematical optimization problems of the form,

More information

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation

Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Course Notes for EE227C (Spring 2018): Convex Optimization and Approximation Instructor: Moritz Hardt Email: hardt+ee227c@berkeley.edu Graduate Instructor: Max Simchowitz Email: msimchow+ee227c@berkeley.edu

More information