MATH36061 Convex Optimization


Manchester Numerical Analysis
MATH36061 Convex Optimization
Martin Lotz
School of Mathematics, The University of Manchester
Manchester, September 26, 2017

Outline
- General information
- What is optimization?
- Course overview


Organisation
- The course website can be found online.
- Tutorials start in week two; the tutorial class in week one will consist of a review of important concepts and Python.
- Problem sets consist of two parts: problems to be worked on at home, and problems to be discussed in class.
- All the material presented in the lectures will be made available online as the course progresses.
- Lecture notes will appear on the website each week.
- Contact: martin.lotz@manchester.ac.uk

Example classes
- Example classes are there to deepen your understanding of the course material. Problem sheets should ideally be looked at before the example class.
- In the example classes:
  - you will be given some time to work on the problems;
  - you should ask questions if some parts are not clear;
  - you will get feedback on your attempts at the problems;
  - we will go through a selection of problems and their solutions.

Python
- Python is an indispensable tool for this course.
- Python is one of the most popular programming languages, used among others by Google, Dropbox, and NASA.
- Many data science jobs require knowledge of Python.
- Availability for this course: on SageMathCloud, or by installing Anaconda Python on your own computer.
- The course page contains links to Python documentation.

Python Example
Python code for printing the Fibonacci numbers below 25:

    def f(a, b):
        c = a + b
        return c

    a, b = 0, 1
    while b < 25:
        a, b = b, f(a, b)
        print(a)

CVXPY
- CVXPY is a convex optimization package for Python.
- CVXPY is freely available; it is easy to install, and the documentation provides some examples to get started. On SageMathCloud it is readily available.
- An overview of Python, Jupyter notebooks, and CVXPY is given during the first tutorial class on Thursday!

Mathematical prerequisites
The course will make extensive use of results from linear algebra (vectors and matrices in R^n) and multivariate calculus (gradients, Hessians). A comprehensive overview of the mathematics needed can be found on the course website.

Outline
- General information
- What is optimization?
- Course overview

Mathematical optimization
A mathematical optimization problem is a problem of the form

    minimize   f(x)
    subject to x ∈ Ω,

where f is the objective function and Ω ⊆ R^n is the set of constraints. Often the set Ω is defined by those x ∈ R^n that satisfy some conditions

    g_1(x) ≤ 0, ..., g_m(x) ≤ 0,

where the g_i are real-valued functions. A vector x* such that f(x*) ≤ f(x) for all other x satisfying the constraints is called a solution or minimizer of the problem.
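To make the definition concrete, here is a minimal numerical sketch (not from the lecture): the toy problem of minimizing f(x) = (x − 3)² over Ω = {x : x ≥ 4}, solved with SciPy's general-purpose solver rather than the CVXPY toolchain used later in the course.

```python
import numpy as np
from scipy.optimize import minimize

# Objective f(x) = (x - 3)^2 and constraint set Omega = {x : x >= 4},
# expressed as a bound on the single variable.
f = lambda x: (x[0] - 3.0) ** 2

res = minimize(f, x0=np.array([10.0]), bounds=[(4.0, None)])
print(res.x[0], res.fun)  # minimizer x* and optimal value f(x*)
```

Since the unconstrained minimizer x = 3 lies outside Ω, the solution sits on the boundary: x* = 4 with optimal value f(x*) = 1.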

Example: least squares
Assume a quantity Y depends linearly on predictors X_1, ..., X_p:

    Y = β_0 + β_1 X_1 + ... + β_p X_p + ε,    (3.1)

where ε is some random error.
- Example: Y could be sales, and X_1, ..., X_p advertisement budgets in different media (TV, internet, radio, billboards, newspapers).
- Goal: determine the model parameters β_0, ..., β_p.
- What we know is data from n ≥ p observations or experiments:

    y_i = β_0 + β_1 x_i1 + ... + β_p x_ip + ε_i,   1 ≤ i ≤ n.

Example: least squares
What we know is data from n ≥ p observations or experiments:

    y_i = β_0 + β_1 x_i1 + ... + β_p x_ip + ε_i,   1 ≤ i ≤ n.

Collect the data in matrices and vectors:

    y = (y_1, ..., y_n)ᵀ,   β = (β_0, β_1, ..., β_p)ᵀ,   ε = (ε_1, ..., ε_n)ᵀ,

    X = [ 1  x_11  ...  x_1p ]
        [ ⋮    ⋮          ⋮  ]
        [ 1  x_n1  ...  x_np ]

The relationship can then be written as y = Xβ + ε.

Example: least squares
Based on the relationship y = Xβ + ε, choose β to minimize the squared 2-norm of the error:

    minimize ||y − Xβ||₂²

- This is an optimization problem with a quadratic objective function and no constraints.
- The problem has a closed-form solution β = (XᵀX)⁻¹Xᵀy.
- There are more efficient methods available than using the closed-form solution.
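As a sketch of the two routes just mentioned (using made-up synthetic data, not the lecture's dataset), NumPy can compare the closed-form solution with a numerically preferable QR/SVD-based least-squares solver:

```python
import numpy as np

# Synthetic data for illustration: n = 50 observations, p = 2 predictors.
rng = np.random.default_rng(0)
n, p = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])  # column of ones for the intercept
beta_true = np.array([1.0, 2.0, -0.5])
y = X @ beta_true + 0.01 * rng.normal(size=n)

# Closed-form solution beta = (X^T X)^{-1} X^T y (solve, rather than invert)
beta_closed = np.linalg.solve(X.T @ X, X.T @ y)

# The more stable route: a dedicated least-squares solver
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_closed, beta_lstsq)  # both estimates are close to beta_true
```

For well-conditioned X the two agree; `lstsq` avoids forming XᵀX, whose condition number is the square of that of X.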

Application: linear regression
- Observation: the log of the basal metabolic rate (energy expenditure) Y in mammals is related to the log of the adult mass X as Y = β_0 + β_1 X.
- Collect data from a database of 573 mammal species and plot the relationship.
- Assemble the vector y and the matrix X, and solve the problem minimize ||y − Xβ||₂.

Application: linear regression

    # Create a p+1 vector of variables
    Beta = Variable(p + 1)

    # Create sum-of-squares objective function
    objective = Minimize(sum_entries(square(X * Beta - y)))

    # Create problem and solve it
    prob = Problem(objective)
    prob.solve()

    plt.plot(bmr[:, 0], bmr[:, 1], 'o')
    xx = np.linspace(0, 14, 100)  # 100 sample points (count assumed; lost in transcription)
    plt.plot(xx, Beta[0].value + Beta[1].value * xx,
             color='red', linewidth=2)
    plt.show()

Application: linear regression
(Figure: scatter plot of log adult body mass against log basal metabolic rate, with the fitted line Y = β_0 + β_1 X.)

Application: linear regression
Y = β_0 + β_1 X.
(Figure: from the episode "Size Matters" of the TV series Wonders of Life.)

Example: linear programming
Linear programming refers to problems of the form

    maximize   c_1 x_1 + ... + c_n x_n
    subject to a_11 x_1 + ... + a_1n x_n ≤ b_1
               ...
               a_m1 x_1 + ... + a_mn x_n ≤ b_m
               x_1 ≥ 0, ..., x_n ≥ 0.

- Can be rewritten in the standard form shown at the beginning!
- Using matrices and vectors:

    maximize   ⟨c, x⟩ (= cᵀx)
    subject to Ax ≤ b (inequality componentwise)
               x ≥ 0
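A hypothetical toy LP in this matrix form can be solved with SciPy's `linprog` (shown here instead of CVXPY as a dependency-light illustration; the problem data below is made up). Note that `linprog` minimizes, so we negate c to maximize:

```python
import numpy as np
from scipy.optimize import linprog

# A small made-up LP:
#   maximize   3 x1 + 2 x2
#   subject to x1 + x2 <= 4,  x1 <= 2,  x >= 0.
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [1.0, 0.0]])
b = np.array([4.0, 2.0])

# Pass -c to maximize c^T x; variable bounds default to x >= 0.
res = linprog(-c, A_ub=A, b_ub=b)
print(res.x, -res.fun)  # optimal point and maximal value
```

The optimum sits at the vertex x = (2, 2) of the feasible polyhedron, with value 10.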

Linear programming: geometry

    {x : ⟨a, x⟩ ≤ b}.

Linear programming: geometry

    {x : ⟨a_1, x⟩ ≤ b_1, ..., ⟨a_m, x⟩ ≤ b_m} = {x : Ax ≤ b}.

A polyhedron.

Linear programming: geometry

    maximize ⟨c, x⟩ subject to Ax ≤ b.

Linear programming: application
A cargo plane has two compartments with capacities C_1 = 35 and C_2 = 40 tonnes, and volumes V_1 = 250 and V_2 = 400 cubic metres.
Three types of cargo:

               Volume (m³/tonne)   Weight (tonnes)   Profit (per tonne)
    Cargo 1           8                  25                 300
    Cargo 2          10                  32                 350
    Cargo 3           7                  28                 270

How should the cargo be distributed to maximize profit?

Linear programming: application
x_ij: amount of cargo i in compartment j.

Objective function:

    f(x) = 300(x_11 + x_12) + 350(x_21 + x_22) + 270(x_31 + x_32).

Constraints:

    x_11 + x_12 ≤ 25                   (total amount of cargo 1)
    x_21 + x_22 ≤ 32                   (total amount of cargo 2)
    x_31 + x_32 ≤ 28                   (total amount of cargo 3)
    x_11 + x_21 + x_31 ≤ 35            (weight constraint on comp. 1)
    x_12 + x_22 + x_32 ≤ 40            (weight constraint on comp. 2)
    8x_11 + 10x_21 + 7x_31 ≤ 250       (volume constraint on comp. 1)
    8x_12 + 10x_22 + 7x_32 ≤ 400       (volume constraint on comp. 2)
    (x_11 + x_21 + x_31)/35 − (x_12 + x_22 + x_32)/40 = 0
                                       (maintain balance of weight ratio)
    x_ij ≥ 0                           (cargo can't have negative weight)

Linear programming: application
As complicated as it looks, it can be solved easily with CVXPY.

    # Define all the matrices and vectors involved
    c = np.array([300, 300, 350, 350, 270, 270])
    A = np.array([[1, 1, 0, 0, 0, 0],
                  [0, 0, 1, 1, 0, 0],
                  [0, 0, 0, 0, 1, 1],
                  [1, 0, 1, 0, 1, 0],
                  [0, 1, 0, 1, 0, 1],
                  [8, 0, 10, 0, 7, 0],
                  [0, 8, 0, 10, 0, 7]])
    b = np.array([25, 32, 28, 35, 40, 250, 400])
    B = np.array([1/35, -1/40, 1/35, -1/40, 1/35, -1/40])
    d = np.zeros(1)

    # Create variables, objective and constraints
    x = Variable(6)
    constraints = [A * x <= b, B * x == d, x >= 0]
    objective = Maximize(c * x)

    # Create a problem using the objective and constraints and solve it
    prob = Problem(objective, constraints)
    prob.solve()

Solution: x_11 = ..., x_12 = ..., x_21 = 0, x_22 = 32, x_31 = 28, x_32 = ...

Example: total variation inpainting
- Optimization methods play an increasingly important role in image and signal processing.
- An image is an m × n matrix U, with each entry u_ij corresponding to a light intensity or colour.
- In the image inpainting problem, one aims to guess the true value of missing or corrupted entries of an image.
- A conceptually simple approach is to replace the image with the closest image among a set of images satisfying typical properties.

Example: total variation inpainting
What are the properties of a typical image?
- Images tend to have large homogeneous areas in which the colour doesn't change much;
- Images have approximately low rank, when interpreted as matrices.
The total variation or TV-norm is the sum of the norms of the horizontal and vertical differences,

    ||U||_TV = Σ_{i=1}^{m−1} Σ_{j=1}^{n−1} sqrt( (u_{i+1,j} − u_{i,j})² + (u_{i,j+1} − u_{i,j})² ).

Example: total variation inpainting
The TV-norm naturally increases with increased variation or sharp edges in an image: comparing two example matrices U_1 and U_2, the matrix with the sharper transitions has the larger TV-norm. Intuitively, we would expect a natural image with artifacts added to it to have a higher TV-norm.
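The TV-norm is easy to compute directly. The sketch below uses two made-up test matrices (a flat image and one with a single horizontal edge), not the matrices from the slide:

```python
import numpy as np

def tv_norm(U):
    """Discrete (isotropic) TV-norm: for each interior pixel, the Euclidean
    norm of the (vertical, horizontal) difference vector, summed up."""
    dx = U[1:, :-1] - U[:-1, :-1]   # vertical differences u_{i+1,j} - u_{i,j}
    dy = U[:-1, 1:] - U[:-1, :-1]   # horizontal differences u_{i,j+1} - u_{i,j}
    return float(np.sum(np.sqrt(dx**2 + dy**2)))

# A flat image has zero total variation; an image with an edge does not.
flat = np.full((4, 4), 5.0)
edge = np.vstack([np.zeros((2, 4)), np.ones((2, 4))])
print(tv_norm(flat), tv_norm(edge))
```

The edge image picks up a unit contribution at each of the three interior pixels along the jump, so its TV-norm is 3 while the flat image's is 0.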

Example: total variation inpainting
Consider the following corrupted image.

Example: total variation inpainting
Applying the optimization problem

    minimize   ||X||_TV
    subject to x_ij = u_ij, (i, j) ∈ Ω,

we recover an approximation to the true image. (Figures: inpainted image and difference image.)

Example: machine learning
In machine learning, the goal is to automatically learn a function f(x) = y from a training set of pairs (x_i, y_i). The problem can be posed as an optimization problem

    min_{f ∈ F} Σ_{i=1}^{m} (f(x_i) − y_i)²,

where F is a class of functions. Examples include regression and classification.
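For the class F of affine functions f(x) = wx + b, this is exactly a least-squares problem. A minimal sketch with made-up, noise-free training data:

```python
import numpy as np

# Made-up training pairs (x_i, y_i) generated by the target f(x) = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Minimize sum_i (f(x_i) - y_i)^2 over (w, b): a linear least-squares problem.
A = np.column_stack([x, np.ones_like(x)])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(w, b)  # recovers the parameters of the generating function
```

With noise-free data the minimizer reproduces the generating function exactly; with noisy data it returns the best affine fit in the least-squares sense.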

Application: letter recognition
By solving an optimization problem (training a multi-layer neural network), the computer learns a function that assigns to an image the closest handwritten letter.

Applications of machine learning
- Handwritten character recognition;
- Classification of gene expression data;
- Computer games;
- Online recommender systems;
- Sentiment analysis, text mining;
- Medical diagnosis;
- Computer vision;
- ...
Developments driven by Google DeepMind, Facebook AI, OpenAI, etc.

Outline
- General information
- What is optimization?
- Course overview

Overview of the course
The course will try to balance three points of view:
- Theory: optimality conditions, duality theory, convex analysis, geometry of linear, quadratic and semidefinite programming.
- Algorithms: descent algorithms, interior-point methods, projected gradient methods, Lagrangian methods.
- Applications: logistics and scheduling, finance, signal processing, convex regularization, semidefinite relaxations of hard combinatorial problems, machine learning, compressive sensing.

The course will be...
- Rigorous: we prove theorems and the occasional lemma.
- Computational: we implement examples and see optimization at work.
- Visual: convex optimization is a geometric theory, so expect to see lots of pictures.

Overview of the course
Next class (Thursday 9-11, Univ. Place 6.212):
- Unconstrained optimization: gradient descent;
- Introduction to the computational tools of this course.

See you next time!
(Figures: "Caged parrot" and "Free parrot" — the inpainting problem minimize ||X||_TV subject to x_ij = u_ij, (i, j) ∈ Ω, applied to an image of a parrot.)


More information

Neural Networks and Deep Learning

Neural Networks and Deep Learning Neural Networks and Deep Learning Professor Ameet Talwalkar November 12, 2015 Professor Ameet Talwalkar Neural Networks and Deep Learning November 12, 2015 1 / 16 Outline 1 Review of last lecture AdaBoost

More information

Linear Regression and Its Applications

Linear Regression and Its Applications Linear Regression and Its Applications Predrag Radivojac October 13, 2014 Given a data set D = {(x i, y i )} n the objective is to learn the relationship between features and the target. We usually start

More information

Machine Learning and Data Mining. Linear classification. Kalev Kask

Machine Learning and Data Mining. Linear classification. Kalev Kask Machine Learning and Data Mining Linear classification Kalev Kask Supervised learning Notation Features x Targets y Predictions ŷ = f(x ; q) Parameters q Program ( Learner ) Learning algorithm Change q

More information

Foundation of Intelligent Systems, Part I. SVM s & Kernel Methods

Foundation of Intelligent Systems, Part I. SVM s & Kernel Methods Foundation of Intelligent Systems, Part I SVM s & Kernel Methods mcuturi@i.kyoto-u.ac.jp FIS - 2013 1 Support Vector Machines The linearly-separable case FIS - 2013 2 A criterion to select a linear classifier:

More information

Machine Learning Basics Lecture 4: SVM I. Princeton University COS 495 Instructor: Yingyu Liang

Machine Learning Basics Lecture 4: SVM I. Princeton University COS 495 Instructor: Yingyu Liang Machine Learning Basics Lecture 4: SVM I Princeton University COS 495 Instructor: Yingyu Liang Review: machine learning basics Math formulation Given training data x i, y i : 1 i n i.i.d. from distribution

More information

Least Mean Squares Regression

Least Mean Squares Regression Least Mean Squares Regression Machine Learning Spring 2018 The slides are mainly from Vivek Srikumar 1 Lecture Overview Linear classifiers What functions do linear classifiers express? Least Squares Method

More information

Welcome to CPSC 4850/ OR Algorithms

Welcome to CPSC 4850/ OR Algorithms Welcome to CPSC 4850/5850 - OR Algorithms 1 Course Outline 2 Operations Research definition 3 Modeling Problems Product mix Transportation 4 Using mathematical programming Course Outline Instructor: Robert

More information

Data Science - Convex optimization and application

Data Science - Convex optimization and application 1 Data Science - Convex optimization and application Data Science - Convex optimization and application Summary We begin by some illustrations in challenging topics in modern data science. Then, this session

More information

Welcome to the Machine Learning Practical Deep Neural Networks. MLP Lecture 1 / 18 September 2018 Single Layer Networks (1) 1

Welcome to the Machine Learning Practical Deep Neural Networks. MLP Lecture 1 / 18 September 2018 Single Layer Networks (1) 1 Welcome to the Machine Learning Practical Deep Neural Networks MLP Lecture 1 / 18 September 2018 Single Layer Networks (1) 1 Introduction to MLP; Single Layer Networks (1) Steve Renals Machine Learning

More information

ECE521 W17 Tutorial 1. Renjie Liao & Min Bai

ECE521 W17 Tutorial 1. Renjie Liao & Min Bai ECE521 W17 Tutorial 1 Renjie Liao & Min Bai Schedule Linear Algebra Review Matrices, vectors Basic operations Introduction to TensorFlow NumPy Computational Graphs Basic Examples Linear Algebra Review

More information

Lecture 9: September 28

Lecture 9: September 28 0-725/36-725: Convex Optimization Fall 206 Lecturer: Ryan Tibshirani Lecture 9: September 28 Scribes: Yiming Wu, Ye Yuan, Zhihao Li Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These

More information

Statistical NLP for the Web

Statistical NLP for the Web Statistical NLP for the Web Neural Networks, Deep Belief Networks Sameer Maskey Week 8, October 24, 2012 *some slides from Andrew Rosenberg Announcements Please ask HW2 related questions in courseworks

More information

Machine Learning Brett Bernstein. Recitation 1: Gradients and Directional Derivatives

Machine Learning Brett Bernstein. Recitation 1: Gradients and Directional Derivatives Machine Learning Brett Bernstein Recitation 1: Gradients and Directional Derivatives Intro Question 1 We are given the data set (x 1, y 1 ),, (x n, y n ) where x i R d and y i R We want to fit a linear

More information

Algebra 1 Bassam Raychouni Grade 8 Math. Greenwood International School Course Description. Course Title: Head of Department:

Algebra 1 Bassam Raychouni Grade 8 Math. Greenwood International School Course Description. Course Title: Head of Department: Course Title: Head of Department: Bassam Raychouni (bassam@greenwood.sh.ae) Teacher(s) + e-mail: Femi Antony (femi.a@greenwood.sch.ae) Cycle/Division: High School Grade Level: 9 or 10 Credit Unit: 1 Duration:

More information

CS145: INTRODUCTION TO DATA MINING

CS145: INTRODUCTION TO DATA MINING CS145: INTRODUCTION TO DATA MINING 5: Vector Data: Support Vector Machine Instructor: Yizhou Sun yzsun@cs.ucla.edu October 18, 2017 Homework 1 Announcements Due end of the day of this Thursday (11:59pm)

More information

Deep Learning for Computer Vision

Deep Learning for Computer Vision Deep Learning for Computer Vision Lecture 4: Curse of Dimensionality, High Dimensional Feature Spaces, Linear Classifiers, Linear Regression, Python, and Jupyter Notebooks Peter Belhumeur Computer Science

More information

CSC321 Lecture 4: Learning a Classifier

CSC321 Lecture 4: Learning a Classifier CSC321 Lecture 4: Learning a Classifier Roger Grosse Roger Grosse CSC321 Lecture 4: Learning a Classifier 1 / 28 Overview Last time: binary classification, perceptron algorithm Limitations of the perceptron

More information

Optimization for Machine Learning

Optimization for Machine Learning Optimization for Machine Learning Elman Mansimov 1 September 24, 2015 1 Modified based on Shenlong Wang s and Jake Snell s tutorials, with additional contents borrowed from Kevin Swersky and Jasper Snoek

More information

ML4NLP Multiclass Classification

ML4NLP Multiclass Classification ML4NLP Multiclass Classification CS 590NLP Dan Goldwasser Purdue University dgoldwas@purdue.edu Social NLP Last week we discussed the speed-dates paper. Interesting perspective on NLP problems- Can we

More information

Neural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann

Neural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann Neural Networks with Applications to Vision and Language Feedforward Networks Marco Kuhlmann Feedforward networks Linear separability x 2 x 2 0 1 0 1 0 0 x 1 1 0 x 1 linearly separable not linearly separable

More information

We choose parameter values that will minimize the difference between the model outputs & the true function values.

We choose parameter values that will minimize the difference between the model outputs & the true function values. CSE 4502/5717 Big Data Analytics Lecture #16, 4/2/2018 with Dr Sanguthevar Rajasekaran Notes from Yenhsiang Lai Machine learning is the task of inferring a function, eg, f : R " R This inference has to

More information

MIT Algebraic techniques and semidefinite optimization February 14, Lecture 3

MIT Algebraic techniques and semidefinite optimization February 14, Lecture 3 MI 6.97 Algebraic techniques and semidefinite optimization February 4, 6 Lecture 3 Lecturer: Pablo A. Parrilo Scribe: Pablo A. Parrilo In this lecture, we will discuss one of the most important applications

More information

Linear Regression (continued)

Linear Regression (continued) Linear Regression (continued) Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 6, 2017 1 / 39 Outline 1 Administration 2 Review of last lecture 3 Linear regression

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning First-Order Methods, L1-Regularization, Coordinate Descent Winter 2016 Some images from this lecture are taken from Google Image Search. Admin Room: We ll count final numbers

More information

Convex Optimization / Homework 1, due September 19

Convex Optimization / Homework 1, due September 19 Convex Optimization 1-725/36-725 Homework 1, due September 19 Instructions: You must complete Problems 1 3 and either Problem 4 or Problem 5 (your choice between the two). When you submit the homework,

More information

CSE 250a. Assignment Noisy-OR model. Out: Tue Oct 26 Due: Tue Nov 2

CSE 250a. Assignment Noisy-OR model. Out: Tue Oct 26 Due: Tue Nov 2 CSE 250a. Assignment 4 Out: Tue Oct 26 Due: Tue Nov 2 4.1 Noisy-OR model X 1 X 2 X 3... X d Y For the belief network of binary random variables shown above, consider the noisy-or conditional probability

More information

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 18

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 18 EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 18 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory May 31, 2012 Andre Tkacenko

More information

A Quick Tour of Linear Algebra and Optimization for Machine Learning

A Quick Tour of Linear Algebra and Optimization for Machine Learning A Quick Tour of Linear Algebra and Optimization for Machine Learning Masoud Farivar January 8, 2015 1 / 28 Outline of Part I: Review of Basic Linear Algebra Matrices and Vectors Matrix Multiplication Operators

More information

Introduction to Machine Learning. Regression. Computer Science, Tel-Aviv University,

Introduction to Machine Learning. Regression. Computer Science, Tel-Aviv University, 1 Introduction to Machine Learning Regression Computer Science, Tel-Aviv University, 2013-14 Classification Input: X Real valued, vectors over real. Discrete values (0,1,2,...) Other structures (e.g.,

More information

Linear Models in Machine Learning

Linear Models in Machine Learning CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,

More information

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.

DS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra. DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1

More information

CSC 411 Lecture 17: Support Vector Machine

CSC 411 Lecture 17: Support Vector Machine CSC 411 Lecture 17: Support Vector Machine Ethan Fetaya, James Lucas and Emad Andrews University of Toronto CSC411 Lec17 1 / 1 Today Max-margin classification SVM Hard SVM Duality Soft SVM CSC411 Lec17

More information

SVAN 2016 Mini Course: Stochastic Convex Optimization Methods in Machine Learning

SVAN 2016 Mini Course: Stochastic Convex Optimization Methods in Machine Learning SVAN 2016 Mini Course: Stochastic Convex Optimization Methods in Machine Learning Mark Schmidt University of British Columbia, May 2016 www.cs.ubc.ca/~schmidtm/svan16 Some images from this lecture are

More information

EE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015

EE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015 EE613 Machine Learning for Engineers Kernel methods Support Vector Machines jean-marc odobez 2015 overview Kernel methods introductions and main elements defining kernels Kernelization of k-nn, K-Means,

More information

g(x,y) = c. For instance (see Figure 1 on the right), consider the optimization problem maximize subject to

g(x,y) = c. For instance (see Figure 1 on the right), consider the optimization problem maximize subject to 1 of 11 11/29/2010 10:39 AM From Wikipedia, the free encyclopedia In mathematical optimization, the method of Lagrange multipliers (named after Joseph Louis Lagrange) provides a strategy for finding the

More information

Lecture Support Vector Machine (SVM) Classifiers

Lecture Support Vector Machine (SVM) Classifiers Introduction to Machine Learning Lecturer: Amir Globerson Lecture 6 Fall Semester Scribe: Yishay Mansour 6.1 Support Vector Machine (SVM) Classifiers Classification is one of the most important tasks in

More information

A direct formulation for sparse PCA using semidefinite programming

A direct formulation for sparse PCA using semidefinite programming A direct formulation for sparse PCA using semidefinite programming A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley Available online at www.princeton.edu/~aspremon

More information

Final Overview. Introduction to ML. Marek Petrik 4/25/2017

Final Overview. Introduction to ML. Marek Petrik 4/25/2017 Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,

More information

Gradient Descent. Dr. Xiaowei Huang

Gradient Descent. Dr. Xiaowei Huang Gradient Descent Dr. Xiaowei Huang https://cgi.csc.liv.ac.uk/~xiaowei/ Up to now, Three machine learning algorithms: decision tree learning k-nn linear regression only optimization objectives are discussed,

More information

Linear Algebra. Preliminary Lecture Notes

Linear Algebra. Preliminary Lecture Notes Linear Algebra Preliminary Lecture Notes Adolfo J. Rumbos c Draft date April 29, 23 2 Contents Motivation for the course 5 2 Euclidean n dimensional Space 7 2. Definition of n Dimensional Euclidean Space...........

More information

Introduction to Unconstrained Optimization: Part 2

Introduction to Unconstrained Optimization: Part 2 Introduction to Unconstrained Optimization: Part 2 James Allison ME 555 January 29, 2007 Overview Recap Recap selected concepts from last time (with examples) Use of quadratic functions Tests for positive

More information

Linear Classification and SVM. Dr. Xin Zhang

Linear Classification and SVM. Dr. Xin Zhang Linear Classification and SVM Dr. Xin Zhang Email: eexinzhang@scut.edu.cn What is linear classification? Classification is intrinsically non-linear It puts non-identical things in the same class, so a

More information

CSC321 Lecture 6: Backpropagation

CSC321 Lecture 6: Backpropagation CSC321 Lecture 6: Backpropagation Roger Grosse Roger Grosse CSC321 Lecture 6: Backpropagation 1 / 21 Overview We ve seen that multilayer neural networks are powerful. But how can we actually learn them?

More information

MATHS WORKSHOPS Algebra, Linear Functions and Series. Business School

MATHS WORKSHOPS Algebra, Linear Functions and Series. Business School MATHS WORKSHOPS Algebra, Linear Functions and Series Business School Outline Algebra and Equations Linear Functions Sequences, Series and Limits Summary and Conclusion Outline Algebra and Equations Linear

More information

Linear Regression. Robot Image Credit: Viktoriya Sukhanova 123RF.com

Linear Regression. Robot Image Credit: Viktoriya Sukhanova 123RF.com Linear Regression These slides were assembled by Eric Eaton, with grateful acknowledgement of the many others who made their course materials freely available online. Feel free to reuse or adapt these

More information

Semidefinite Programming Basics and Applications

Semidefinite Programming Basics and Applications Semidefinite Programming Basics and Applications Ray Pörn, principal lecturer Åbo Akademi University Novia University of Applied Sciences Content What is semidefinite programming (SDP)? How to represent

More information

Lecture 10: Support Vector Machine and Large Margin Classifier

Lecture 10: Support Vector Machine and Large Margin Classifier Lecture 10: Support Vector Machine and Large Margin Classifier Applied Multivariate Analysis Math 570, Fall 2014 Xingye Qiao Department of Mathematical Sciences Binghamton University E-mail: qiao@math.binghamton.edu

More information

Convex optimization. Javier Peña Carnegie Mellon University. Universidad de los Andes Bogotá, Colombia September 2014

Convex optimization. Javier Peña Carnegie Mellon University. Universidad de los Andes Bogotá, Colombia September 2014 Convex optimization Javier Peña Carnegie Mellon University Universidad de los Andes Bogotá, Colombia September 2014 1 / 41 Convex optimization Problem of the form where Q R n convex set: min x f(x) x Q,

More information

Machine Learning Lecture 7

Machine Learning Lecture 7 Course Outline Machine Learning Lecture 7 Fundamentals (2 weeks) Bayes Decision Theory Probability Density Estimation Statistical Learning Theory 23.05.2016 Discriminative Approaches (5 weeks) Linear Discriminant

More information

15. Conic optimization

15. Conic optimization L. Vandenberghe EE236C (Spring 216) 15. Conic optimization conic linear program examples modeling duality 15-1 Generalized (conic) inequalities Conic inequality: a constraint x K where K is a convex cone

More information

Lecture 6 : Projected Gradient Descent

Lecture 6 : Projected Gradient Descent Lecture 6 : Projected Gradient Descent EE227C. Lecturer: Professor Martin Wainwright. Scribe: Alvin Wan Consider the following update. x l+1 = Π C (x l α f(x l )) Theorem Say f : R d R is (m, M)-strongly

More information

Quick Introduction to Nonnegative Matrix Factorization

Quick Introduction to Nonnegative Matrix Factorization Quick Introduction to Nonnegative Matrix Factorization Norm Matloff University of California at Davis 1 The Goal Given an u v matrix A with nonnegative elements, we wish to find nonnegative, rank-k matrices

More information

DATA MINING AND MACHINE LEARNING

DATA MINING AND MACHINE LEARNING DATA MINING AND MACHINE LEARNING Lecture 5: Regularization and loss functions Lecturer: Simone Scardapane Academic Year 2016/2017 Table of contents Loss functions Loss functions for regression problems

More information

Nonlinear Optimization Methods for Machine Learning

Nonlinear Optimization Methods for Machine Learning Nonlinear Optimization Methods for Machine Learning Jorge Nocedal Northwestern University University of California, Davis, Sept 2018 1 Introduction We don t really know, do we? a) Deep neural networks

More information

Lecture 5 January 16, 2013

Lecture 5 January 16, 2013 UBC CPSC 536N: Sparse Approximations Winter 2013 Prof. Nick Harvey Lecture 5 January 16, 2013 Scribe: Samira Samadi 1 Combinatorial IPs 1.1 Mathematical programs { min c Linear Program (LP): T x s.t. a

More information

MAT 419 Lecture Notes Transcribed by Eowyn Cenek 6/1/2012

MAT 419 Lecture Notes Transcribed by Eowyn Cenek 6/1/2012 (Homework 1: Chapter 1: Exercises 1-7, 9, 11, 19, due Monday June 11th See also the course website for lectures, assignments, etc) Note: today s lecture is primarily about definitions Lots of definitions

More information