MATH36061 Convex Optimization
MATH36061 Convex Optimization
Martin Lotz
School of Mathematics, The University of Manchester
Manchester, September 26, 2017
Outline

- General information
- What is optimization?
- Course overview
Organisation

- The course website can be found under:
- Tutorials start in week two; the tutorial class in week one will consist of a review of important concepts and Python.
- Problem sets consist of two parts: problems to be worked on at home, and problems to be discussed in class.
- All the material presented in the lectures will be made available online as the course progresses.
- Lecture notes will appear on the website each week.
- Contact: martin.lotz@manchester.ac.uk
Example classes

Example classes are there to deepen the understanding of the course material. Problem sheets should ideally be looked at before the example class. In the example classes:

- you will be given some time to work on the problems,
- you should ask questions if some parts are not clear,
- you will get feedback on your attempts at the problems,
- we will go through a selection of problems and their solutions.
Python

- Python is an indispensable tool for this course.
- Python is one of the most popular programming languages, used by Google, Dropbox, NASA, and many others.
- Many data science jobs require knowledge of Python.
- Availability for this course: on SageMathCloud, or by installing Anaconda Python on your computer.
- The course page contains links to Python documentation.
Python

Example Python code for printing the Fibonacci numbers below 25:

    def f(a, b):
        c = a + b
        return c

    a, b = 0, 1
    while b < 25:
        a, b = b, f(a, b)
        print(a)
CVXPY

- CVXPY is a convex optimization package for Python.
- CVXPY is freely available for download.
- It is easy to install, and the documentation provides some examples to get started. On SageMathCloud it is readily available.
- An overview of Python, Jupyter notebooks, and CVXPY is given during the first tutorial class on Thursday!
Mathematical prerequisites

The course will make extensive use of results from linear algebra (vectors and matrices in R^n) and multivariate calculus (gradients, Hessians). A comprehensive overview of the mathematics needed can be found on the course website.
Outline

- General information
- What is optimization?
- Course overview
Mathematical optimization

A mathematical optimization problem is a problem of the form

    minimize    f(x)
    subject to  x ∈ Ω,

where f is the objective function and Ω ⊆ R^n is the set of constraints. Often the set Ω is defined by those x ∈ R^n that satisfy some conditions

    g_1(x) ≤ 0, ..., g_m(x) ≤ 0,

where the g_i are real-valued functions. A vector x* such that f(x*) ≤ f(x) for all other x satisfying the constraints is called a solution or minimizer of the problem.
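As a toy illustration (not from the slides; the function and constraint set are made up), the problem of minimizing f(x) = (x − 1)² over Ω = [2, 5] can be attacked by a crude grid search. The unconstrained minimizer x = 1 is infeasible, so the solution sits on the boundary of Ω:

```python
# Toy example: minimize f(x) = (x - 1)^2 subject to x in Omega = [2, 5].
# Crude grid search, for illustration only; the course develops far
# better algorithms (gradient descent, interior-point methods, ...).

def f(x):
    return (x - 1) ** 2

# Discretize the constraint set Omega = [2, 5]
omega = [2 + 3 * k / 1000 for k in range(1001)]

x_star = min(omega, key=f)  # minimizer over the grid
# Since x = 1 is infeasible, the minimum is attained at the
# boundary point x* = 2, with optimal value f(x*) = 1.
```

The same pattern (objective, feasible set, minimizer) recurs in every problem below; only the structure of f and Ω changes.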
Example: least squares

Assume a quantity Y depends linearly on predictors X_1, ..., X_p:

    Y = β_0 + β_1 X_1 + ⋯ + β_p X_p + ε,    (3.1)

where ε is some random error. Example: Y could be sales, and X_1, ..., X_p advertisement budgets in different media (TV, internet, radio, billboards, newspapers). Goal: determine the model parameters β_0, ..., β_p. What we know is data from n ≥ p observations or experiments:

    y_i = β_0 + β_1 x_i1 + ⋯ + β_p x_ip + ε_i,    1 ≤ i ≤ n.
Example: least squares

What we know is data from n ≥ p observations or experiments:

    y_i = β_0 + β_1 x_i1 + ⋯ + β_p x_ip + ε_i,    1 ≤ i ≤ n.

Collect the data in matrices and vectors:

    y = (y_1, ..., y_n)ᵀ,   β = (β_0, β_1, ..., β_p)ᵀ,   ε = (ε_1, ..., ε_n)ᵀ,

and let X be the n × (p+1) matrix whose i-th row is (1, x_i1, ..., x_ip). The relationship can then be written as y = Xβ + ε.
Example: least squares

Based on the relationship y = Xβ + ε, choose β to minimize the squared 2-norm of the error:

    minimize ‖y − Xβ‖₂².

This is an optimization problem with a quadratic objective function and no constraints. The problem has the closed-form solution

    β = (XᵀX)⁻¹ Xᵀ y.

There are more efficient methods available than using the closed-form solution.
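A small sketch with made-up data makes the closed form concrete, and contrasts the normal equations with NumPy's lstsq, which uses a more numerically stable factorization and is one of the "more efficient methods" alluded to above:

```python
import numpy as np

# Made-up data: four observations of a single predictor (p = 1),
# with a column of ones for the intercept beta_0.
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0]])
y = np.array([1.0, 2.9, 5.1, 7.0])

# Closed-form solution via the normal equations (X^T X) beta = X^T y
beta_ne = np.linalg.solve(X.T @ X, X.T @ y)

# Preferable in practice: an SVD-based least-squares solver
beta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_ne)  # both solvers return the same least-squares fit
```

For well-conditioned problems the two agree to machine precision; for ill-conditioned X, forming XᵀX squares the condition number, which is why the factorization-based route is preferred.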
Application: linear regression

Observation: the log of the basal metabolic rate (energy expenditure) Y in mammals is related to the log of adult body mass X as Y = β_0 + β_1 X. Collect data from a database of 573 mammal species and plot the relationship. Assemble the vector y and matrix X, and solve the problem

    minimize ‖y − Xβ‖₂².
Application: linear regression

    # Create a (p+1)-vector of variables
    Beta = Variable(p + 1)

    # Create sum-of-squares objective function
    objective = Minimize(sum_entries(square(X*Beta - y)))

    # Create problem and solve it
    prob = Problem(objective)
    prob.solve()

    plt.plot(bmr[:, 0], bmr[:, 1], 'o')
    xx = np.linspace(0, 14)
    plt.plot(xx, Beta[0].value + Beta[1].value*xx, color='red', linewidth=2)
    plt.show()
Application: linear regression

(Figure: scatter plot of log basal metabolic rate against log adult body mass, with the fitted line Y = β_0 + β_1 X.)

(Figure: still from the episode "Size Matters" of the TV series Wonders of Life.)
Example: linear programming

Linear programming refers to problems of the form

    maximize    c_1 x_1 + ⋯ + c_n x_n
    subject to  a_11 x_1 + ⋯ + a_1n x_n ≤ b_1
                ⋮
                a_m1 x_1 + ⋯ + a_mn x_n ≤ b_m
                x_1 ≥ 0, ..., x_n ≥ 0.

This can be rewritten in the standard form shown at the beginning! Using matrices and vectors:

    maximize    ⟨c, x⟩  (= cᵀx)
    subject to  Ax ≤ b  (inequality componentwise)
                x ≥ 0.
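The geometry behind linear programming (an optimum is attained at a vertex of the feasible polyhedron) can be seen in a small sketch with invented data: enumerate candidate vertices by intersecting pairs of constraint boundaries, keep the feasible ones, and pick the best. Real solvers (simplex, interior-point) are vastly more efficient; this is illustration only.

```python
# Maximize c^T x subject to A x <= b, x >= 0, by brute-force
# vertex enumeration (illustration of the polyhedral geometry).
import itertools
import numpy as np

c = np.array([3.0, 5.0])
A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [3.0, 2.0]])
b = np.array([4.0, 12.0, 18.0])

# Append the nonnegativity constraints x >= 0 as -x <= 0
A_full = np.vstack([A, -np.eye(2)])
b_full = np.concatenate([b, np.zeros(2)])

best_val, best_x = -np.inf, None
for i, j in itertools.combinations(range(len(b_full)), 2):
    M = A_full[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue  # parallel constraints: no intersection point
    v = np.linalg.solve(M, b_full[[i, j]])  # candidate vertex
    if np.all(A_full @ v <= b_full + 1e-9):  # keep only feasible vertices
        val = c @ v
        if val > best_val:
            best_val, best_x = val, v

print(best_x, best_val)  # optimal vertex (2, 6) with value 36
```

Every linear objective is maximized at one of the finitely many vertices, which is exactly what the pictures on the geometry slides show.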
Linear programming: geometry

A single linear constraint defines a halfspace: {x : ⟨a, x⟩ ≤ b}.

The feasible set is an intersection of halfspaces, a polyhedron:

    {x : ⟨a_1, x⟩ ≤ b_1, ..., ⟨a_m, x⟩ ≤ b_m} = {x : Ax ≤ b}.

A linear program maximizes a linear function over a polyhedron:

    maximize ⟨c, x⟩ subject to Ax ≤ b.
Linear programming: application

A cargo plane has two compartments with capacities C_1 = 35 and C_2 = 40 tonnes and volumes V_1 = 250 and V_2 = 400 cubic metres. Three types of cargo:

             Volume (m³/tonne)   Weight (tonnes)   Profit (£/tonne)
    Cargo 1         8                  25                300
    Cargo 2        10                  32                350
    Cargo 3         7                  28                270

How to distribute the cargo to maximize profit?
Linear programming: application

x_ij: amount of cargo i in compartment j. Objective function:

    f(x) = 300(x_11 + x_12) + 350(x_21 + x_22) + 270(x_31 + x_32).

Constraints:

    x_11 + x_12 ≤ 25              (total amount of cargo 1)
    x_21 + x_22 ≤ 32              (total amount of cargo 2)
    x_31 + x_32 ≤ 28              (total amount of cargo 3)
    x_11 + x_21 + x_31 ≤ 35       (weight constraint on compartment 1)
    x_12 + x_22 + x_32 ≤ 40       (weight constraint on compartment 2)
    8x_11 + 10x_21 + 7x_31 ≤ 250  (volume constraint on compartment 1)
    8x_12 + 10x_22 + 7x_32 ≤ 400  (volume constraint on compartment 2)
    (x_11 + x_21 + x_31)/35 − (x_12 + x_22 + x_32)/40 = 0
                                  (maintain balance of weight ratio)
    x_ij ≥ 0                      (cargo can't have negative weight)
Linear programming: application

As complicated as it looks, it can be solved easily with CVXPY.

    # Define all the matrices and vectors involved
    c = np.array([300, 300, 350, 350, 270, 270])
    A = np.array([[1, 1, 0, 0, 0, 0],
                  [0, 0, 1, 1, 0, 0],
                  [0, 0, 0, 0, 1, 1],
                  [1, 0, 1, 0, 1, 0],
                  [0, 1, 0, 1, 0, 1],
                  [8, 0, 10, 0, 7, 0],
                  [0, 8, 0, 10, 0, 7]])
    b = np.array([25, 32, 28, 35, 40, 250, 400])
    B = np.array([1/35, -1/40, 1/35, -1/40, 1/35, -1/40])
    d = np.zeros(1)

    # Create variables, objective and constraints
    x = Variable(6)
    constraints = [A*x <= b, B*x == d, x >= 0]
    objective = Maximize(c*x)

    # Create a problem using the objective and constraints and solve it
    prob = Problem(objective, constraints)
    prob.solve()

The solution is x_11 = 6.75, x_12 ≈ 7.71, x_21 = 0, x_22 = 32, x_31 = 28, x_32 = 0.
Example: total variation inpainting

- Optimization methods play an increasingly important role in image and signal processing.
- An image is an m × n matrix U, with each entry u_ij corresponding to a light intensity or colour.
- In the image inpainting problem, one aims to guess the true value of missing or corrupted entries of an image.
- A conceptually simple approach is to replace the image with the closest image among a set of images satisfying typical properties.
Example: total variation inpainting

What are typical properties of a typical image?

- Images tend to have large homogeneous areas in which the colour doesn't change much.
- Images have approximately low rank, when interpreted as matrices.

The total variation or TV-norm is the sum of the norms of the horizontal and vertical differences:

    ‖U‖_TV = Σ_{i=1}^{m−1} Σ_{j=1}^{n−1} √( (u_{i+1,j} − u_{i,j})² + (u_{i,j+1} − u_{i,j})² ).
Example: total variation inpainting

The TV-norm naturally increases with increased variation or sharp edges in an image: of two matrices of the same size, one with large jumps between neighbouring entries has a much larger TV-norm than a smooth one. Intuitively, we would expect a natural image with artifacts added to it to have a higher TV-norm.
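A small sketch with invented matrices makes this concrete. (Boundary differences are treated as zero here, one common convention; the slide's formula sums over interior differences, which gives the same values on these examples.)

```python
import math

def tv_norm(U):
    """Isotropic TV-norm of a matrix U (list of lists); differences
    reaching past the boundary are taken to be zero."""
    m, n = len(U), len(U[0])
    total = 0.0
    for i in range(m):
        for j in range(n):
            dv = U[i + 1][j] - U[i][j] if i + 1 < m else 0.0  # vertical
            dh = U[i][j + 1] - U[i][j] if j + 1 < n else 0.0  # horizontal
            total += math.sqrt(dv * dv + dh * dh)
    return total

smooth = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]   # constant image: TV-norm 0
spiky  = [[1, 1, 1], [1, 9, 1], [1, 1, 1]]   # one bright artifact
print(tv_norm(smooth))  # 0.0
print(tv_norm(spiky))   # strictly larger
```

The single bright pixel creates sharp jumps against all of its neighbours, so the spiky matrix has a strictly positive TV-norm while the constant one has TV-norm zero, which is exactly the property inpainting exploits.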
Example: total variation inpainting

Consider the following corrupted image. (Figure: corrupted image.)
Example: total variation inpainting

Applying the optimization problem

    minimize ‖X‖_TV subject to x_ij = u_ij, (i, j) ∈ Ω,

we recover an approximation to the true image. (Figures: inpainted image; difference image.)
Example: machine learning

In machine learning, the goal is to automatically learn a function f(x) = y from a training set of pairs (x_i, y_i). The problem can be posed as an optimization problem

    minimize over f ∈ F:  Σ_{i=1}^{m} (f(x_i) − y_i)²,

where F is a class of functions. Examples include regression and classification.
Application: letter recognition

By solving an optimization problem (training a multi-layer neural network), the computer learns a function that assigns to an image the closest handwritten letter.
Applications of machine learning

- Handwritten character recognition
- Classification of gene expression data
- Computer games
- Online recommender systems
- Sentiment analysis, text mining
- Medical diagnosis
- Computer vision
- ...

Developments driven by Google DeepMind, Facebook AI, OpenAI, etc.
Outline

- General information
- What is optimization?
- Course overview
Overview of the course

The course will try to balance three points of view:

- Theory. Optimality conditions, duality theory, convex analysis, geometry of linear, quadratic and semidefinite programming.
- Algorithms. Descent algorithms, interior-point methods, projected gradient methods, Lagrangian methods.
- Applications. Logistics and scheduling, finance, signal processing, convex regularization, semidefinite relaxations of hard combinatorial problems, machine learning, compressive sensing.
The course will be...

- Rigorous. We prove theorems and the occasional lemma.
- Computational. We implement examples and see optimization at work.
- Visual. Convex optimization is a geometric theory, so expect to see lots of pictures.
Overview of the course

Next class (Thursday 9-11, Univ. Place 6.212):

- Unconstrained optimization: gradient descent;
- Introduction to the computational tools of this course.
See you next time!

(Figure: Caged parrot / Free parrot, an inpainting example: minimize ‖X‖_TV subject to x_ij = u_ij, (i, j) ∈ Ω.)
More informationStatistical NLP for the Web
Statistical NLP for the Web Neural Networks, Deep Belief Networks Sameer Maskey Week 8, October 24, 2012 *some slides from Andrew Rosenberg Announcements Please ask HW2 related questions in courseworks
More informationMachine Learning Brett Bernstein. Recitation 1: Gradients and Directional Derivatives
Machine Learning Brett Bernstein Recitation 1: Gradients and Directional Derivatives Intro Question 1 We are given the data set (x 1, y 1 ),, (x n, y n ) where x i R d and y i R We want to fit a linear
More informationAlgebra 1 Bassam Raychouni Grade 8 Math. Greenwood International School Course Description. Course Title: Head of Department:
Course Title: Head of Department: Bassam Raychouni (bassam@greenwood.sh.ae) Teacher(s) + e-mail: Femi Antony (femi.a@greenwood.sch.ae) Cycle/Division: High School Grade Level: 9 or 10 Credit Unit: 1 Duration:
More informationCS145: INTRODUCTION TO DATA MINING
CS145: INTRODUCTION TO DATA MINING 5: Vector Data: Support Vector Machine Instructor: Yizhou Sun yzsun@cs.ucla.edu October 18, 2017 Homework 1 Announcements Due end of the day of this Thursday (11:59pm)
More informationDeep Learning for Computer Vision
Deep Learning for Computer Vision Lecture 4: Curse of Dimensionality, High Dimensional Feature Spaces, Linear Classifiers, Linear Regression, Python, and Jupyter Notebooks Peter Belhumeur Computer Science
More informationCSC321 Lecture 4: Learning a Classifier
CSC321 Lecture 4: Learning a Classifier Roger Grosse Roger Grosse CSC321 Lecture 4: Learning a Classifier 1 / 28 Overview Last time: binary classification, perceptron algorithm Limitations of the perceptron
More informationOptimization for Machine Learning
Optimization for Machine Learning Elman Mansimov 1 September 24, 2015 1 Modified based on Shenlong Wang s and Jake Snell s tutorials, with additional contents borrowed from Kevin Swersky and Jasper Snoek
More informationML4NLP Multiclass Classification
ML4NLP Multiclass Classification CS 590NLP Dan Goldwasser Purdue University dgoldwas@purdue.edu Social NLP Last week we discussed the speed-dates paper. Interesting perspective on NLP problems- Can we
More informationNeural Networks with Applications to Vision and Language. Feedforward Networks. Marco Kuhlmann
Neural Networks with Applications to Vision and Language Feedforward Networks Marco Kuhlmann Feedforward networks Linear separability x 2 x 2 0 1 0 1 0 0 x 1 1 0 x 1 linearly separable not linearly separable
More informationWe choose parameter values that will minimize the difference between the model outputs & the true function values.
CSE 4502/5717 Big Data Analytics Lecture #16, 4/2/2018 with Dr Sanguthevar Rajasekaran Notes from Yenhsiang Lai Machine learning is the task of inferring a function, eg, f : R " R This inference has to
More informationMIT Algebraic techniques and semidefinite optimization February 14, Lecture 3
MI 6.97 Algebraic techniques and semidefinite optimization February 4, 6 Lecture 3 Lecturer: Pablo A. Parrilo Scribe: Pablo A. Parrilo In this lecture, we will discuss one of the most important applications
More informationLinear Regression (continued)
Linear Regression (continued) Professor Ameet Talwalkar Professor Ameet Talwalkar CS260 Machine Learning Algorithms February 6, 2017 1 / 39 Outline 1 Administration 2 Review of last lecture 3 Linear regression
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning First-Order Methods, L1-Regularization, Coordinate Descent Winter 2016 Some images from this lecture are taken from Google Image Search. Admin Room: We ll count final numbers
More informationConvex Optimization / Homework 1, due September 19
Convex Optimization 1-725/36-725 Homework 1, due September 19 Instructions: You must complete Problems 1 3 and either Problem 4 or Problem 5 (your choice between the two). When you submit the homework,
More informationCSE 250a. Assignment Noisy-OR model. Out: Tue Oct 26 Due: Tue Nov 2
CSE 250a. Assignment 4 Out: Tue Oct 26 Due: Tue Nov 2 4.1 Noisy-OR model X 1 X 2 X 3... X d Y For the belief network of binary random variables shown above, consider the noisy-or conditional probability
More informationEE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 18
EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 18 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory May 31, 2012 Andre Tkacenko
More informationA Quick Tour of Linear Algebra and Optimization for Machine Learning
A Quick Tour of Linear Algebra and Optimization for Machine Learning Masoud Farivar January 8, 2015 1 / 28 Outline of Part I: Review of Basic Linear Algebra Matrices and Vectors Matrix Multiplication Operators
More informationIntroduction to Machine Learning. Regression. Computer Science, Tel-Aviv University,
1 Introduction to Machine Learning Regression Computer Science, Tel-Aviv University, 2013-14 Classification Input: X Real valued, vectors over real. Discrete values (0,1,2,...) Other structures (e.g.,
More informationLinear Models in Machine Learning
CS540 Intro to AI Linear Models in Machine Learning Lecturer: Xiaojin Zhu jerryzhu@cs.wisc.edu We briefly go over two linear models frequently used in machine learning: linear regression for, well, regression,
More informationDS-GA 1002 Lecture notes 0 Fall Linear Algebra. These notes provide a review of basic concepts in linear algebra.
DS-GA 1002 Lecture notes 0 Fall 2016 Linear Algebra These notes provide a review of basic concepts in linear algebra. 1 Vector spaces You are no doubt familiar with vectors in R 2 or R 3, i.e. [ ] 1.1
More informationCSC 411 Lecture 17: Support Vector Machine
CSC 411 Lecture 17: Support Vector Machine Ethan Fetaya, James Lucas and Emad Andrews University of Toronto CSC411 Lec17 1 / 1 Today Max-margin classification SVM Hard SVM Duality Soft SVM CSC411 Lec17
More informationSVAN 2016 Mini Course: Stochastic Convex Optimization Methods in Machine Learning
SVAN 2016 Mini Course: Stochastic Convex Optimization Methods in Machine Learning Mark Schmidt University of British Columbia, May 2016 www.cs.ubc.ca/~schmidtm/svan16 Some images from this lecture are
More informationEE613 Machine Learning for Engineers. Kernel methods Support Vector Machines. jean-marc odobez 2015
EE613 Machine Learning for Engineers Kernel methods Support Vector Machines jean-marc odobez 2015 overview Kernel methods introductions and main elements defining kernels Kernelization of k-nn, K-Means,
More informationg(x,y) = c. For instance (see Figure 1 on the right), consider the optimization problem maximize subject to
1 of 11 11/29/2010 10:39 AM From Wikipedia, the free encyclopedia In mathematical optimization, the method of Lagrange multipliers (named after Joseph Louis Lagrange) provides a strategy for finding the
More informationLecture Support Vector Machine (SVM) Classifiers
Introduction to Machine Learning Lecturer: Amir Globerson Lecture 6 Fall Semester Scribe: Yishay Mansour 6.1 Support Vector Machine (SVM) Classifiers Classification is one of the most important tasks in
More informationA direct formulation for sparse PCA using semidefinite programming
A direct formulation for sparse PCA using semidefinite programming A. d Aspremont, L. El Ghaoui, M. Jordan, G. Lanckriet ORFE, Princeton University & EECS, U.C. Berkeley Available online at www.princeton.edu/~aspremon
More informationFinal Overview. Introduction to ML. Marek Petrik 4/25/2017
Final Overview Introduction to ML Marek Petrik 4/25/2017 This Course: Introduction to Machine Learning Build a foundation for practice and research in ML Basic machine learning concepts: max likelihood,
More informationGradient Descent. Dr. Xiaowei Huang
Gradient Descent Dr. Xiaowei Huang https://cgi.csc.liv.ac.uk/~xiaowei/ Up to now, Three machine learning algorithms: decision tree learning k-nn linear regression only optimization objectives are discussed,
More informationLinear Algebra. Preliminary Lecture Notes
Linear Algebra Preliminary Lecture Notes Adolfo J. Rumbos c Draft date April 29, 23 2 Contents Motivation for the course 5 2 Euclidean n dimensional Space 7 2. Definition of n Dimensional Euclidean Space...........
More informationIntroduction to Unconstrained Optimization: Part 2
Introduction to Unconstrained Optimization: Part 2 James Allison ME 555 January 29, 2007 Overview Recap Recap selected concepts from last time (with examples) Use of quadratic functions Tests for positive
More informationLinear Classification and SVM. Dr. Xin Zhang
Linear Classification and SVM Dr. Xin Zhang Email: eexinzhang@scut.edu.cn What is linear classification? Classification is intrinsically non-linear It puts non-identical things in the same class, so a
More informationCSC321 Lecture 6: Backpropagation
CSC321 Lecture 6: Backpropagation Roger Grosse Roger Grosse CSC321 Lecture 6: Backpropagation 1 / 21 Overview We ve seen that multilayer neural networks are powerful. But how can we actually learn them?
More informationMATHS WORKSHOPS Algebra, Linear Functions and Series. Business School
MATHS WORKSHOPS Algebra, Linear Functions and Series Business School Outline Algebra and Equations Linear Functions Sequences, Series and Limits Summary and Conclusion Outline Algebra and Equations Linear
More informationLinear Regression. Robot Image Credit: Viktoriya Sukhanova 123RF.com
Linear Regression These slides were assembled by Eric Eaton, with grateful acknowledgement of the many others who made their course materials freely available online. Feel free to reuse or adapt these
More informationSemidefinite Programming Basics and Applications
Semidefinite Programming Basics and Applications Ray Pörn, principal lecturer Åbo Akademi University Novia University of Applied Sciences Content What is semidefinite programming (SDP)? How to represent
More informationLecture 10: Support Vector Machine and Large Margin Classifier
Lecture 10: Support Vector Machine and Large Margin Classifier Applied Multivariate Analysis Math 570, Fall 2014 Xingye Qiao Department of Mathematical Sciences Binghamton University E-mail: qiao@math.binghamton.edu
More informationConvex optimization. Javier Peña Carnegie Mellon University. Universidad de los Andes Bogotá, Colombia September 2014
Convex optimization Javier Peña Carnegie Mellon University Universidad de los Andes Bogotá, Colombia September 2014 1 / 41 Convex optimization Problem of the form where Q R n convex set: min x f(x) x Q,
More informationMachine Learning Lecture 7
Course Outline Machine Learning Lecture 7 Fundamentals (2 weeks) Bayes Decision Theory Probability Density Estimation Statistical Learning Theory 23.05.2016 Discriminative Approaches (5 weeks) Linear Discriminant
More information15. Conic optimization
L. Vandenberghe EE236C (Spring 216) 15. Conic optimization conic linear program examples modeling duality 15-1 Generalized (conic) inequalities Conic inequality: a constraint x K where K is a convex cone
More informationLecture 6 : Projected Gradient Descent
Lecture 6 : Projected Gradient Descent EE227C. Lecturer: Professor Martin Wainwright. Scribe: Alvin Wan Consider the following update. x l+1 = Π C (x l α f(x l )) Theorem Say f : R d R is (m, M)-strongly
More informationQuick Introduction to Nonnegative Matrix Factorization
Quick Introduction to Nonnegative Matrix Factorization Norm Matloff University of California at Davis 1 The Goal Given an u v matrix A with nonnegative elements, we wish to find nonnegative, rank-k matrices
More informationDATA MINING AND MACHINE LEARNING
DATA MINING AND MACHINE LEARNING Lecture 5: Regularization and loss functions Lecturer: Simone Scardapane Academic Year 2016/2017 Table of contents Loss functions Loss functions for regression problems
More informationNonlinear Optimization Methods for Machine Learning
Nonlinear Optimization Methods for Machine Learning Jorge Nocedal Northwestern University University of California, Davis, Sept 2018 1 Introduction We don t really know, do we? a) Deep neural networks
More informationLecture 5 January 16, 2013
UBC CPSC 536N: Sparse Approximations Winter 2013 Prof. Nick Harvey Lecture 5 January 16, 2013 Scribe: Samira Samadi 1 Combinatorial IPs 1.1 Mathematical programs { min c Linear Program (LP): T x s.t. a
More informationMAT 419 Lecture Notes Transcribed by Eowyn Cenek 6/1/2012
(Homework 1: Chapter 1: Exercises 1-7, 9, 11, 19, due Monday June 11th See also the course website for lectures, assignments, etc) Note: today s lecture is primarily about definitions Lots of definitions
More information